Speaker: Juan Ignacio Álvarez Trejos. Abstract: This presentation covers the work presented at Odyssey 2024, focusing on speaker diarization in two-speaker scenarios. End-to-end neural speaker diarization systems are designed to handle overlapping speech while accurately distinguishing between speakers. In this…
Transformers for Binding Prediction of Hypoxia-Induced Factors
Speaker: Manuel Fernando Mollón Laorca. Abstract: Hypoxia-inducible factors (HIFs) are proteins that play a crucial role in the cellular response to low oxygen levels. Accurate prediction of the binding of these factors to their target DNA is essential for understanding…
Whisper‑based spoken term detection systems for search on speech ALBAYZIN evaluation challenge
Speaker: Javier Tejedor Noguerales. Abstract: The vast amount of information stored in audio repositories makes it necessary to develop efficient, automatic methods for searching audio content. To that end, search on speech (SoS) has received much attention in…
Road map for Albayzin Diarization Challenge 2024
Speaker: Jérémie Touati. Abstract: The diarization challenge of the 2024 Albayzin evaluation stands out for its various difficulties. The recordings, which come from databases of Spanish radio and television programs, can last up to several hours and contain an undetermined and…
Introduction to the Language-Based Audio Retrieval task.
Speaker: Manuel Otero. Abstract: Language-Based Audio Retrieval is a task of the DCASE Challenge based on retrieving audio information from natural language descriptions. Two of the best-performing state-of-the-art approaches will…
Data Augmentation for Respiratory Cycle Classification
Speaker: Miguel Ángel. Abstract: Analysing respiratory audio recordings to detect and classify adventitious respiratory sounds is vital for developing continuous monitoring tools for patients with respiratory diseases. The ICBHI 2017 database is the most widely…
Diarization Introduction & EEND Perceiver-based Diarization
Speaker: Alicia Lozano Díez. Abstract: In this talk, I will present an introduction to the speaker diarization task as well as the latest neural-network-based approaches, such as self-attention end-to-end neural diarization (EEND) with encoder-decoder attractors (EDA), as opposed…
Introduction to Reinforcement Learning.
Speaker: Tamas Endrei. Abstract: Reinforcement learning (RL) has emerged as one of the most fascinating fields of machine learning, providing solutions to challenging problems ranging from complex robotic behaviors to optimizing neural network architectures. Despite its immense potential, RL’s complex…
GPU Parallel Computing for Deep Learning
Speaker: Beltrán Labrador Serrano. Abstract: Large Language Models (LLMs) are transforming natural language processing and are now impacting speech processing. This talk addresses the challenge of training the massive neural networks required to follow this trend. I will present GPU…
Rotary Position Embeddings (RoPE) in Transformers.
Speaker: Doroteo Torre Toledano. Abstract: Since Transformers were proposed in 2017, they have dominated the state of the art in several domains, including language modelling, speech processing, and even image processing. Although the main ideas of the original Transformers are essentially kept, there…