Speaker: Javier Tejedor Noguerales

Abstract:

Spoken term detection is the task of detecting terms (sequences of words) within audio archives, which makes it well suited to accessing the information stored in audio repositories. This talk will present a spoken term detection system based on Whisper ASR for the COSER corpus, the largest speech corpus that includes rural speech from many regions of Spain. The corpus is challenging because it contains rural speech from elderly people, covers all the dialects and languages spoken in Spain, and includes overlapping speech between interviewer and interviewee.