Speaker: Tamas Endrei. Abstract: Although deep reinforcement learning (DRL) deals with sequential decision-making problems, temporal information representation is absent from state-of-the-art actor-critic algorithms. The reliance on only the current timestep information causes instability in concurrent actions. Furthermore, the over-reliance on… Read More
Enhancing Sound Event Detection and Speaker Verification employing weak supervision
Speaker: Sara Barahona Quirós. Abstract: In this seminar, we will explore approaches for training acoustic event detection and speaker verification systems employing limited labels. Specifically, for the first task, we will explain the optimization process of a system based on… Read More
Transformers for Binding Prediction of Hypoxia-Induced Factors
Speaker: Manuel Fernando Mollón Laorca. Abstract: Hypoxia-inducible factors (HIFs) are proteins that play a crucial role in the cellular response to low oxygen levels. Accurate prediction of the binding of these factors to their target DNA is essential for understanding… Read More
Whisper‑based spoken term detection systems for search on speech ALBAYZIN evaluation challenge
Speaker: Javier Tejedor Noguerales. Abstract: The vast amount of information stored in audio repositories makes it necessary to develop efficient, automatic methods for searching audio content. In that direction, search on speech (SoS) has received much attention in… Read More
Road map for Albayzin Diarization Challenge 2024
Speaker: Jérémie Touati. Abstract: The diarization challenge of the 2024 Albayzin evaluation stands out for its various difficulties. The recordings, which come from databases of Spanish radio and television programs, can last up to several hours; they contain an undetermined and… Read More
Introduction to the Language-Based Audio Retrieval task
Speaker: Manuel Otero. Abstract: Language-Based Audio Retrieval is a task of the DCASE Challenge, which is based on the retrieval of audio information from natural language descriptions. Two of the best performing approaches in the state of the art will… Read More
Data Augmentation for Respiratory Cycle Classification
Speaker: Miguel Ángel. Abstract: Analysing respiratory audio recordings in order to detect and classify adventitious respiratory sounds is of vital importance for the development of continuous monitoring tools for patients with respiratory diseases. The ICBHI 2017 database is the most widely… Read More
DeepMUSE Research Project granted to AUDIAS
A new research project from the Spanish Research Agency has been (provisionally) granted to AUDIAS. The project is entitled “DeepMUSE: Multi-task and Semi-supervised Deep Learning for Speech and Audio Processing”. It starts in 2022 and will end by the end… Read More
Beltrán Labrador Selected for a Summer Internship at Google Research NY
Beltrán Labrador Serrano has recently been selected for a summer internship at the Speech Processing group of Google Research in New York, USA. He will start the internship in May and return to Madrid in September.
BigSSL: Large-Scale Semi-Supervised Learning for ASR
Speaker: Laura Herrera. Abstract: This paper deals with results obtained on very large automatic speech recognition models. A large amount of labelled data is not always available, and the resulting models sometimes do not generalize well enough. Consequently, the authors propose to use pre-trained… Read More