Speaker: Laura Herrera
Abstract: This paper presents results obtained with very large automatic speaker recognition models.
Large amounts of labeled data are not always available, and models trained on such data sometimes do not generalize well enough. Consequently, the authors propose using models that are pre-trained and self-trained on unlabeled data obtained from YouTube. The results show that scaling up model size greatly increases data efficiency. However, when the labeled dataset grows very large, the effect becomes smaller.