A new research project from the Spanish Research Agency has been (provisionally) granted to AUDIAS. The project is entitled “DeepMUSE: Multi-task and Semi-supervised Deep Learning for Speech and Audio Processing”. It starts in 2022 and will end by the end… Read More
Beltrán Labrador Selected for a Summer Internship at Google Research NY
Beltrán Labrador Serrano has been recently selected for a summer internship at the Speech Processing group of Google Research in New York, USA. He will be starting the internship in May and returning to Madrid in September.
BigSSL: Large-Scale Semi-Supervised Learning for ASR
Speaker: Laura Herrera Abstract: This paper deals with results obtained with very large automatic speech recognition models. A large amount of labelled data is not always available, and models trained on limited data often do not generalize well. Consequently, the authors propose to use pre-trained… Read More
Efficient Neural Approaches for Automatic Speech Recognition
Speaker: Doroteo Torre Toledano Abstract: Many different end-to-end neural approaches have been proposed in recent years in the field of automatic speech recognition (ASR). However, most of the available research compares systems only in terms of accuracy (word error… Read More
Structured Output Learning
Speaker: María Pilar Fernández Rodríguez Abstract: Speech applications dealing with conversations require not only recognizing the spoken words, but also determining who spoke when, the language, punctuation, capitalization… This is typically addressed by merging the outputs… Read More
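As a toy illustration of this kind of output merging (not the method presented in the talk), the sketch below combines ASR and diarization outputs by assigning each recognized word to the speaker whose diarization segment overlaps it most; all names and timings are hypothetical:

```python
def assign_speakers(words, segments):
    """Merge ASR and diarization outputs (illustrative sketch).

    words: list of (word, start_time, end_time) from an ASR system.
    segments: list of (speaker, start_time, end_time) from diarization.
    Each word is labeled with the speaker whose segment overlaps it most.
    """
    labeled = []
    for word, ws, we in words:
        best_speaker, best_overlap = None, 0.0
        for speaker, ss, se in segments:
            # Temporal overlap between the word and the speaker segment.
            overlap = min(we, se) - max(ws, ss)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        labeled.append((word, best_speaker))
    return labeled
```

For example, with words `[("hello", 0.0, 0.5), ("world", 1.0, 1.5)]` and segments `[("A", 0.0, 0.9), ("B", 0.9, 2.0)]`, the first word is attributed to speaker A and the second to speaker B.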
VoxCeleb Experiment: Fairness
Speaker: Almudena Aguilera Abstract: The experiment is based on the VoxCeleb dataset [1], using two pre-trained models. The main idea of these experiments was to study fairness problems across the different demographic groups present in the database… Read More
Diego de Benito has returned from his research stay at the Speech@FIT research group of Brno University of Technology (BUT)
Diego de Benito Gorrón completed a research stay at the prestigious Speech@FIT group of Brno University of Technology (BUT) in the Czech Republic, from September 2021 to December 2021. There, he carried out research on acoustic source separation… Read More
Semi-Supervised Music Tagging Transformer
Speaker: David Martín Abstract: Music Tagging Transformer (MTT) was recently presented at the ISMIR 2021 conference as one of the most promising deep learning approaches for Music Information Retrieval. It consists of a semi-supervised approach in which the model captures… Read More
Encoder-Decoder Based Attractor Calculation for End-to-End Neural Diarization
Speaker: Alicia Lozano Díez Abstract: In this talk, we will review in depth the algorithms behind end-to-end neural systems for speaker diarization. In particular, we will describe how the encoder-decoder part of the model calculates “attractors” that capture… Read More
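As a rough sketch of how attractors are used once computed (the encoder-decoder that produces them is the learned part and is omitted here), per-frame speaker activity posteriors can be obtained as sigmoids of dot products between frame embeddings and attractors; the function name and toy vectors below are made up for illustration:

```python
import math

def speaker_activities(frame_embeddings, attractors):
    """Illustrative sketch: given frame embeddings and speaker attractors,
    compute per-frame, per-speaker activity posteriors as
    sigmoid(embedding · attractor). Because each speaker gets an
    independent sigmoid, several speakers can be active in the same
    frame (overlapped speech).
    """
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))
    return [
        [sigmoid(sum(e * a for e, a in zip(emb, att))) for att in attractors]
        for emb in frame_embeddings
    ]
```

With a frame embedding well aligned with one attractor and anti-aligned with another, the first speaker's posterior is near 1 and the second's near 0 for that frame.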
Unsupervised Sound Separation Using Mixture Invariant Training
Speaker: Diego de Benito Gorrón Abstract: In recent years, rapid progress has been made on the problem of single-channel sound separation using supervised training of deep neural networks. In such supervised approaches, a model is trained to predict the component… Read More
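For readers unfamiliar with mixture invariant training (MixIT), the core idea of its loss can be sketched as follows: the model separates a mixture of two reference mixtures into several estimated sources, and the loss is taken under the best binary assignment of estimated sources back to the two references. This is a toy illustration under that assumption, not the paper's implementation:

```python
from itertools import product

def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def mixit_loss(est_sources, mix1, mix2):
    """Mixture invariant training loss (illustrative sketch).

    Each estimated source is assigned to exactly one of the two
    reference mixtures; we search all 2^M binary assignments and
    keep the one with the lowest total reconstruction error.
    """
    n = len(mix1)
    best = float("inf")
    for assignment in product((0, 1), repeat=len(est_sources)):
        # Remix the estimated sources into two candidate mixtures.
        remix = [[0.0] * n, [0.0] * n]
        for source, a in zip(est_sources, assignment):
            for t in range(n):
                remix[a][t] += source[t]
        loss = mse(remix[0], mix1) + mse(remix[1], mix2)
        best = min(best, loss)
    return best
```

If the estimated sources exactly recompose the two reference mixtures under some assignment, the loss is zero; this permutation-style search over assignments is what makes the training "mixture invariant".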