Speaker: Joaquín González Rodríguez. Abstract: This talk is based on: Parametric Wave Field Coding for Precomputed Sound Propagation (ACM Transactions on Graphics, Vol. 33, No. 4, Article 38, Publication Date: July 2014) Parametric Directional Coding for Precomputed Sound Propagation (ACM… Read More
Breath cycle detection in respiratory audio
Speaker: Miguel Ángel Martínez Pay. Abstract: Neural networks applied to the detection of acoustic events in respiratory audio. Introduction to the ICBHI 2017 database dedicated to the classification of respiratory cycles into “normal”, “with crackles”, “with wheezes”, “with both”. Main… Read More
Alicia Lozano-Diez selected for an MSCA grant for an internship at MIT
Alicia Lozano-Diez will spend three months as a researcher at the Massachusetts Institute of Technology (MIT, Cambridge, MA, USA) within the GHAIA (Geometric and Harmonic Analysis with Interdisciplinary Applications) project of the H2020-MSCA-RISE program of the European Commission. She will work… Read More
PhysioNet Challenge 2016: Classification of Heart Sound Recordings
Speaker: Javier Galán Fernández. Abstract: Cardiovascular diseases are the leading cause of death in the world, accounting for 32% of all deaths recorded throughout the year. The 2016 PhysioNet challenge aimed to encourage the development of algorithms to classify heart… Read More
How speaker diarization evolved recently: from clustering to end-to-end approaches
Speaker: Alicia Lozano Díez. Abstract: Speaker diarization systems aim to segment a multi-speaker audio recording according to speaker changes, providing the time stamps of the activity of each speaker, including segments where nobody speaks and those where more than one… Read More
VoxCeleb-Spain: Design, Acquisition and Preliminary Evaluation
Speaker: Manuel Otero González. Abstract: Description of VoxCeleb and its latest Challenges (2019-2022), design and collection of an audio database of Spanish celebrities, and preliminary evaluation of a pre-trained system on the acquired data.
MusicLM: Generating music from text
Speaker: Laura Herrera Alarcón. Abstract: Based on https://arxiv.org/pdf/2301.11325.pdf. This paper presents a new model for generating high-fidelity music from text descriptions. It combines SoundStream, w2v-BERT and MuLan, three models that make it possible to obtain temporal coherence and high-quality audio of… Read More
Iterative pseudo-forced alignment tool
Speaker: W. Fernando López Gavilánez. Abstract: High-quality data labeling for specific domains is costly and time-consuming for human annotators. In this work, we propose an iterative pseudo-forced alignment algorithm for long audio files with low-quality transcriptions. The alignments are iteratively done by… Read More
Differentially Private Fine-Tuning for Language Models
Speaker: Beltrán Labrador Serrano. Abstract: Based on https://arxiv.org/abs/2110.06500. In this talk we will discuss the paper Differentially Private Fine-Tuning for Language Models, where the authors give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-scale pre-trained language models,… Read More
Conformer Architecture for Sound Event Detection (DCASE)
Speaker: Sara Barahona Quirós. Abstract: Sound Event Detection is the task of automating the human ability to recognize sound events in the environment. In recent years, the creation of evaluations such as the Detection and Classification… Read More