Speaker: Laura Herrera Abstract: These papers (https://arxiv.org/pdf/1904.05862.pdf and https://arxiv.org/pdf/2006.11477.pdf) explore unsupervised learning from raw audio for speech recognition. A large amount of labelled data is not always available; consequently, wav2vec uses a causal convolutional network trained with large amounts of unlabelled…
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
Speaker: María Pilar Fernández Gallego Abstract: Transcribing meetings containing overlapped speech with only a single distant microphone (SDM) has been one of the most challenging problems for automatic speech recognition (ASR). While various approaches have been proposed, all previous studies…
Selective Kernel Networks
Speaker: Sergio Segovia Abstract: It is well known in the neuroscience community that the receptive field sizes of visual cortical neurons are modulated by the stimulus, a fact that has rarely been considered in constructing CNNs. We propose a dynamic selection mechanism in…
Calibration of Multiclass Probabilistic Classifiers
Speaker: Sergio Márquez Abstract: Today’s Deep Neural Networks (DNNs) are used for numerous classification tasks, achieving high performance in terms of accuracy. In some cases, probabilistic classifiers, which assign a confidence value to each of the predictions made, are used.…
Deep Learning Models with Self-Attention for the Detection of Audio Events
Speaker: Julio González Abstract: This talk is a presentation of the BSc thesis “Modelos de aprendizaje profundo con auto-atención para detección de eventos de audio”. It describes the implementation of the Transformer and Conformer neural networks and presents the results of the test…
End-to-end Speaker Diarization
Speaker: Alicia Lozano Diez Abstract: In this talk, I will describe new approaches to the task of speaker diarization based on end-to-end neural networks, which present several advantages with respect to traditional systems based on clustering of speaker embeddings. I…
Normalizing Flows for calibration of multiclass probabilistic classifiers
Speaker: Sergio Márquez Abstract: Today’s Deep Neural Networks (DNNs) have achieved high accuracy, far exceeding that of the networks used ten years ago. Nevertheless, the outputs provided by these modern networks are less well calibrated, which becomes a major problem in…
Transfer Learning from computer vision to audio event detection
Speaker: Sergio Segovia Abstract: A brief summary of my lecture: as part of my doctorate, we are exploring the idea of applying transfer learning from the computer vision domain to the detection of acoustic events. The…
Modeling Uncertainty with Bayesian Neural Networks
Speaker: Sergio Álvarez Abstract: Deep Neural Networks (DNNs) have revolutionized many fields in pattern recognition, such as speech recognition and object detection. There are, however, some applications, mainly sensitive ones, in which neural networks struggle to offer competitive performance. These applications…
New loss function to improve calibration with mixup
Speaker: Juan Maroñas Molano Abstract: Deep Neural Networks (DNNs) represent the state of the art in many tasks. However, due to their overparameterization, their generalization capabilities are in doubt and remain an open field of study. Consequently, DNNs can overfit and…