Speaker: Doroteo Torre Toledano.
Abstract: In September 2022, OpenAI made freely available a speech recognition neural network called Whisper. One of its main differences from the current state of the art is the use of a huge amount of training data for which the transcriptions provide only weak supervision. The model has shown unprecedented accuracy and robustness across a wide variety of databases and conditions. In this talk we will analyze the main features of the model and the implications of the results it achieves.