Speaker: Sara Barahona Quirós.

Abstract: Sound Event Detection is the task of automating the human ability to recognize sound events in the environment. In recent years, evaluations such as the Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge have not only boosted research in this area but have also set a benchmark. In 2020, a system based on the recently proposed Conformer architecture won Task 4 of this challenge. Although this new architecture appeared to have set state-of-the-art results in this area, the system did not perform as expected in the two scenarios proposed for Polyphonic Sound Event Detection once the novelties of the 2021 DCASE evaluation were introduced. In this talk we will therefore discuss the possible causes of this drop in performance, especially for PSDS 1, which is the most affected metric.