Speaker: Laura Herrera Abstract: These papers (https://arxiv.org/pdf/1904.05862.pdf and https://arxiv.org/pdf/2006.11477.pdf) explore unsupervised learning from raw audio for speech recognition. A large amount of labelled data is not always available; consequently, wav2vec uses a causal convolutional network trained with large amounts of unlabelled… Read More
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
Speaker: María Pilar Fernández Gallego Abstract: Transcribing meetings containing overlapped speech with only a single distant microphone (SDM) has been one of the most challenging problems for automatic speech recognition (ASR). While various approaches have been proposed, all previous studies… Read More
Selective Kernel Networks
Speaker: Sergio Segovia Abstract: It is well known in the neuroscience community that the receptive field size of visual cortical neurons is modulated by the stimulus, which has rarely been considered in constructing CNNs. We propose a dynamic selection mechanism in… Read More
Alicia Lozano Díez returns to UAM as Assistant Professor after almost two years in the prestigious research group Speech@FIT (Brno University of Technology, Czech Republic)
Alicia completed a postdoctoral research stay funded by the European Union under the H2020 Marie Skłodowska-Curie Individual Fellowship program. The project “Robust End-To-End SPEAKER recognition based on deep learning and attention models” (ETE SPEAKER, 843627) she has developed between 2019… Read More
Calibration of Multiclass Probabilistic Classifiers
Speaker: Sergio Márquez Abstract: Today’s Deep Neural Networks (DNNs) are used for numerous classification tasks, achieving high performance in terms of accuracy. In some cases, probabilistic classifiers, which assign a confidence value to each of the predictions made, are used.… Read More
Deep Learning Models with Self-Attention for the Detection of Audio Events
Speaker: Julio González Abstract: This talk is a presentation of the BSc thesis “Modelos de aprendizaje profundo con auto-atención para detección de eventos de audio”. It describes the implementation of the Transformer and Conformer neural networks and presents the results of the test… Read More
End-to-end Speaker Diarization
Speaker: Alicia Lozano Díez Abstract: In this talk, I will describe new approaches to the task of speaker diarization based on end-to-end neural networks, which present several advantages with respect to traditional systems based on clustering of speaker embeddings. I… Read More
Normalizing Flows for calibration of multiclass probabilistic classifiers
Speaker: Sergio Márquez Abstract: Today’s Deep Neural Networks (DNNs) have achieved high accuracy, far exceeding that of the networks used ten years ago. Nevertheless, the outputs provided by these modern networks are less well calibrated, becoming a major problem in… Read More
Transfer Learning from computer vision to audio event detection
Speaker: Sergio Segovia Abstract: A brief summary of my lecture: as part of my doctorate, we are exploring the idea of applying transfer learning from the computer vision domain to the detection of acoustic events. The… Read More
Modeling Uncertainty with Bayesian Neural Networks
Speaker: Sergio Álvarez Abstract: Deep Neural Networks (DNNs) have revolutionized many fields in pattern recognition like speech recognition and object detection. There are, however, some applications in which Neural Networks struggle to offer competitive performance, mainly sensitive ones. These applications… Read More