Speaker: Guillermo Recio Martín. Abstract: AVASpeech is a publicly available dataset created in 2018 to contribute to the speech activity detection (SAD) task. This dataset contains three different types of audio segments: clean speech, speech co-occurring with music… Read More
Sergio Álvarez Balanya selected for an internship at Amazon
Sergio Álvarez Balanya was recently selected for a summer internship at Amazon in Barcelona, Spain. He will start the internship in June and return to Madrid in December.
Conformer-based sound event detection with semi-supervised learning and data augmentation
Speaker: Sara Barahona Quirós. Abstract: This paper presents a Conformer-based sound event detection (SED) method, which uses semi-supervised learning and data augmentation. The proposed method employs Conformer, a convolution-augmented Transformer that is able to exploit local features of audio data… Read More
Speaker Diarization with Region Proposal Network
Speaker: Sergio Izquierdo del Álamo. Abstract: Speaker diarization is an important pre-processing step for many speech applications, and it aims to solve the “who spoke when” problem. Although the standard diarization systems can achieve satisfactory results in various scenarios, they… Read More