Speaker: W. Fernando López Gavilanez. Abstract: Motivated by the lack of high-quality labeled data for specific scenarios, such as emergencies in the home environment, we explored a CTC-segmentation method to generate a specific-purpose speech dataset. The project seeks the quality improvement of… Read More
Beltrán Labrador Selected for a Summer Internship at Google Research NY
Beltrán Labrador Serrano has been recently selected for a summer internship at the Speech Processing group of Google Research in New York, USA. He will be starting the internship in May and returning to Madrid in September.
BigSSL: Large-Scale Semi-Supervised Learning for ASR
Speaker: Laura Herrera Abstract: This paper deals with results obtained on very large automatic speaker recognition models.A large amount of labelled data is not always available and sometimes they do not generalize enough. Consequently, the authors propose to use pre-trained… Read More
Efficient Neural Approaches for Automatic Speech Recognition
Speaker: Doroteo Torre Toledano Abstract: Many different end-to-end neural approaches have been proposed in the last years in the field of automatic speech recognition (ASR). However, most of the research available compares systems only in terms of accuracy (word error… Read More
Structured Output Learning
Speaker: María Pilar Fernández Rodríguez Abstract: Speech applications dealing with conversations require not only recognizing the spoken words, but also determining who spoke when, the language, punctuation, capitalization… To deal with it, it is typically addressed by merging the outputs… Read More
Voxceleb Experiment: fairness
Speaker: Almudena Aguilera Abstract: The experiment is based on the dataset from Voxceleb [1], using the two pre-trained models. The main idea of these experiments was to study the fairness problems in different demographic groups present in the data base… Read More