Speaker: Jérémie Touati. Abstract:

The diarization challenge of the 2024 Albayzin evaluation stands out for its many difficulties. The recordings, drawn from databases of Spanish radio and television programs, can last up to several hours and contain a large, unknown number of speakers, as well as non-negligible amounts of overlapping speech, music, and noise. Several approaches will be considered and combined to tackle the challenge, including traditional cascaded systems and newer end-to-end neural diarization (EEND) systems. We will place particular emphasis on fine-tuning the DiaPer model, which appears to be both the most recent and the most promising architecture for handling these difficulties.