Speaker: Laura Herrera Alarcón.
Abstract:
These papers present the idea of Knowledge Distillation (KD), a technique for compressing and accelerating large models whose computational and storage costs are otherwise prohibitive, so that they can be deployed in real-time applications or on devices with limited resources. A summary of the main types of Knowledge Distillation is given. In addition, One-Step Knowledge Distillation and Fine-Tuning (OS-KDFT), which integrates KD and fine-tuning into a single step for the speaker verification task, is presented; the second paper demonstrates the effectiveness of this method on state-of-the-art models such as wav2vec 2.0 and HuBERT.
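To make the core idea concrete, the following is a minimal sketch of the classic response-based KD loss, in which a student is trained to match the temperature-softened output distribution of a teacher. This is an illustrative example rather than code from either paper (in particular, it is not the OS-KDFT objective), and the temperature and weighting values are arbitrary placeholders.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Response-based KD: weighted sum of a KL term between the
    temperature-softened teacher and student distributions, and the
    usual cross-entropy on the hard labels."""
    # Soften both output distributions with the temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between softened distributions, scaled by T^2
    # to keep gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    # Standard supervised term on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```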
References:
https://arxiv.org/pdf/2006.05525.pdf
https://arxiv.org/pdf/2305.17394.pdf