Speaker: Doroteo Torre Toledanos. Abstract: Language-based audio retrieval is the task of retrieving audio segments containing sound described in a natural language text. This task was first proposed in a DCASE Challenge in 2022 as a subtask of the audio…
Development of a Guardrail System for Bank Movement Assistant
Speaker: Miguel Ángel Martínez Pay. Abstract: This seminar outlines the process of creating a guardrail for a banking transactions assistant. The guardrail acts as a security system that filters user queries, determining which can be processed by the assistant and…
Neural Discrete Representation Learning Revisited: Applications of VQ-VAE
Speaker: Manuel Fernando Mollón Laorca. Abstract: Since the publication of Neural Discrete Representation Learning in 2018, Vector Quantized Variational Autoencoders (VQ-VAEs) have gained significant attention for their ability to bridge continuous and discrete representations. In particular, their integration with transformer…
You Only Hear Once: A YOLO-like Algorithm for Audio Segmentation and Sound Event Detection
Speaker: Rosa María Hornero Romera. Abstract: Presentation of the paper https://arxiv.org/abs/2109.00962. Audio segmentation and sound event detection are essential aspects of machine listening, focusing on identifying acoustic classes and their boundaries. These tasks play a key role in applications such as…
The Expected Cost: One Performance Metric to Rule Them All
Speaker: Daniel Ramos Castro. Abstract: Based on https://openreview.net/forum?id=3mN9QNWArl. Abstract of original paper: “The expected cost (EC) is one of the main classification metrics introduced in statistical and machine learning books. It is based on the assumption that, for a given…
Cybersecurity Today: Attackers and Defenders
Speaker: Pablo González Escribano. Abstract: As cyber threats continue to evolve at a rapid pace, understanding the tactics, techniques, and procedures (TTPs) employed by attackers is crucial for enhancing defense strategies. In this session, we explored the current landscape of…
Real-time Detection of Synthetic Speech
Speaker: William Fernando López Gavilánez. Abstract: Advances in speech synthesis technology have facilitated numerous beneficial applications. However, they also pose significant threats, especially in the realm of identity spoofing. The study explores the potential of leveraging complex spectrograms for real-time…
Past, present and ¿future? of Scaling Laws for Neural Language Models
Speaker: Beltrán Labrador. Abstract: This presentation examines the scaling laws for neural networks that were foundational to the development of modern, large-scale language models. It revisits the 2020 OpenAI paper that established a key principle: model performance scales predictably with…
Emotion Recognition Based On Speech Analysis For The EmoSPeech 2024 Challenge
Speaker: Manuel Otero. Abstract: This master’s thesis addresses the analysis and recognition of emotions in speech, within the framework of the EmoSPeech 2024 challenge. Different state-of-the-art approaches are presented, from traditional methods to current models…
Deep Learning Insights Inspired by Reinforcement Learning Research
Speaker: Tamas Endrei. Abstract: Despite deep reinforcement learning being around for more than 10 years, traditional deep learning best practices have largely avoided the field until now. This talk elaborates on deep learning techniques uncovered through RL-motivated research, touching on…