Speaker: Clara Adsuar Ávila. Abstract: In this project, we address the importance of enhancing the accessibility and usefulness of Deep Learning technologies for non-standard speakers. From a linguistic perspective, rural Spanish areas are rich in dialectal variety. However, most technology…
Evaluating Posterior Probabilities: Decision Theory, Proper Scoring Rules, and Calibration
Speaker: Daniel Ramos Castro. Abstract: Most machine learning classifiers are designed to output posterior probabilities for the classes given the input sample. These probabilities may be used to make the categorical decision on the class of the sample; provided as…
One model to rule them all? Towards end-to-end joint speaker diarization and speech recognition
Speaker: Laura Herrera Alarcón. Abstract: This paper presents a novel framework for joint speaker diarization (SD) and automatic speech recognition (ASR), named SLIDAR (sliding-window diarization-augmented recognition). SLIDAR can process arbitrary-length inputs and can handle any number of speakers, effectively…
Emotion recognition in Spanish audio
Speaker: Manuel Otero González. Abstract: This talk will explain the task of emotion recognition in Spanish audio, presenting the most advanced state-of-the-art approaches, such as Wav2Vec2 and W2V-Bert. It will also introduce the EmoSPeech challenge, whose…