Speaker: William Fernando López Abstract: Wake-up-word (WuW) detection is a fundamental component of voice assistants. Undesired activation of the device is often caused by external noise such as background conversations, TV, or music. At Telefónica we have been working on…
Unsupervised pre-training for learning speech representations: Wav2Vec and Wav2Vec2.0
Speaker: Laura Herrera Abstract: These papers (https://arxiv.org/pdf/1904.05862.pdf and https://arxiv.org/pdf/2006.11477.pdf) explore unsupervised learning from raw audio for speech recognition. A large amount of labelled data is not always available; consequently, wav2vec uses a causal convolutional network trained with large amounts of unlabelled…
Large-scale pre-training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
Speaker: María Pilar Fernández Gallego Abstract: Transcribing meetings containing overlapped speech with only a single distant microphone (SDM) has been one of the most challenging problems for automatic speech recognition (ASR). While various approaches have been proposed, all previous studies…