Speaker: Pilar Fernández Gallego

Abstract: ASR (Automatic Speech Recognition) systems have improved dramatically in recent years, thanks both to advances in deep learning and to the collection of large datasets used to train them. However, it has been demonstrated that some of the best-known systems, such as those developed by Google, Amazon, and Microsoft, do not work equally well for all subgroups of the population: accuracy differs widely depending on the age, gender, race, accent, and even socio-economic status of the speakers. In short, many factors cause bias and weaknesses in ASR systems.
On the other hand, OpenAI launched Whisper last year with great results, without the need for fine-tuning to use it in a specific field. But is it biased against some subgroups of the population? In this study we evaluate it and compare it with one of the most downloaded models on Hugging Face.