Speaker: Arturo Domínguez Santos. Abstract: This Master’s thesis addresses the challenge of investigating how emotions affect speaker verification and proposes a system that integrates this emotional variability to try to improve accuracy. The focus is on the speaker’s emotions, which has traditionally… Read More
Exploring Speech Foundation Models for End-to-End Speaker Diarization
Speaker: Laura Herrera Alarcón. Abstract: In this Master’s thesis, the use of pre-trained models for the diarization task has been studied in order to exploit their ability to extract robust and discriminative features. In particular, the WavLM model has been combined with… Read More
Interpretation of fingerprint evidence with likelihood ratios (LRs)
Speaker: Joaquín González Rodríguez. Abstract: The forensic fingerprint identification process based on the ACE-V method, widely implemented, makes absolute identification or exclusion decisions that depend on opinions that vary from expert to expert (for example, whether we consider an observed… Read More
Diarization Introduction & EEND Perceiver-based Diarization
Speaker: Alicia Lozano Díez. Abstract: In this talk, I will present an introduction to the speaker diarization task as well as the latest approaches based on neural networks, such as self-attention end-to-end neural diarization (EEND) with encoder-decoder attractors (EDA) as opposed… Read More
Introduction to Reinforcement Learning.
Speaker: Tamas Endrei. Abstract: Reinforcement learning (RL) has emerged as one of the most fascinating fields of machine learning, providing solutions to challenging problems ranging from complex robotics behaviors to optimizing neural network architectures. Despite its immense potential, RL’s complex… Read More
GPU Parallel Computing for Deep Learning
Speaker: Beltrán Labrador Serrano. Abstract: Large Language Models (LLMs) are transforming natural language processing and are now impacting speech processing. This talk addresses the challenge of training the massive neural networks required to follow this trend. I will present GPU… Read More
Rotary Position Embeddings (RoPE) in Transformers.
Speaker: Doroteo Torre Toledano. Abstract: Since Transformers were proposed in 2017, they have dominated the state-of-the-art in several domains including language modelling, speech processing, and even image processing. Although the main ideas of the original Transformers are essentially kept, there… Read More
Large Language Models in Protein Engineering
Speaker: Natalia Pinto Estéban. Abstract: The intersection of artificial intelligence and protein engineering represents an innovative frontier in scientific exploration. In this presentation, titled ‘Large Language Models in Protein Engineering,’ we delve into the field of advanced language models, focusing… Read More
Lute and vihuela in the Renaissance period: instruments and music
Speaker: Joaquín González Rodríguez. Abstract: In this talk we will present an overview of two extremely popular plucked musical instruments in 16th-century Europe, the Lute and its Spanish version, the Vihuela. Sharing a common tuning and playing characteristics… Read More
DiarizationLM: speaker diarization post-processing with large language models
Speaker: Laura Herrera Alarcón. Abstract: This paper presents a framework designed to post-process the outputs of speaker diarization systems using large language models (LLMs). The framework aims to enhance the readability of the diarized transcripts and reduce the word diarization error rate (WDER). For… Read More