ISCA Archive Interspeech 2025

Teacher-Free Knowledge Distillation for Improving Short-Utterance Spoken Language Identification

Spandan Dey, Hirak Mondal, Sanjay Kumar Kurmi

Spoken language identification (LID) systems degrade in performance as the test input duration decreases. Delving deeper, we show that 36.94% of the misclassifications on 2-second (s) LID inputs occur due to out-of-scope elements such as non-speech, named entities, filler words, and overlapped speech. To mitigate this, we propose teacher-free knowledge distillation (TF-KD) using online label smoothing. This method accumulates the prediction logits of correctly classified training segments from the preceding epoch and uses them as soft labels for distillation in the next epoch. We further enhance TF-KD with dynamic weights, conditional label updates, and entropy-based soft-label computation. Compared with existing KD-based solutions for 2s inputs, our approach achieves consistent Cavg improvements in both same-corpora and cross-corpora evaluations, without training a separate teacher network.
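
The core mechanism described above (accumulate predictions of correctly classified segments in one epoch, reuse them as soft labels in the next) can be sketched as follows. This is a minimal PyTorch-style illustration of teacher-free distillation via online label smoothing, not the authors' implementation; class and parameter names (OnlineSoftLabels, tf_kd_loss, alpha, smoothing) are illustrative assumptions, and the paper's dynamic weights, conditional label update, and entropy-based soft-label computation are not included.

```python
import torch
import torch.nn.functional as F


class OnlineSoftLabels:
    """Per-class soft labels built from correctly classified segments of the previous epoch."""

    def __init__(self, num_classes: int, smoothing: float = 0.1):
        self.num_classes = num_classes
        # Initialize with uniform label smoothing for the first epoch.
        self.soft_labels = torch.full((num_classes, num_classes),
                                      smoothing / (num_classes - 1))
        self.soft_labels.fill_diagonal_(1.0 - smoothing)
        self._sums = torch.zeros(num_classes, num_classes)
        self._counts = torch.zeros(num_classes)

    @torch.no_grad()
    def accumulate(self, logits: torch.Tensor, targets: torch.Tensor):
        # Keep only the correctly classified segments of the current epoch.
        probs = logits.softmax(dim=-1)
        correct = probs.argmax(dim=-1) == targets
        for c in targets[correct].unique():
            mask = correct & (targets == c)
            self._sums[c] += probs[mask].sum(dim=0)
            self._counts[c] += mask.sum()

    @torch.no_grad()
    def step_epoch(self):
        # Averaged accumulated predictions become next epoch's soft labels.
        updated = self._counts > 0
        self.soft_labels[updated] = (
            self._sums[updated] / self._counts[updated].unsqueeze(-1)
        )
        self._sums.zero_()
        self._counts.zero_()


def tf_kd_loss(logits, targets, soft_labels, alpha=0.5):
    # Hard-label cross-entropy plus distillation against the accumulated soft labels.
    ce = F.cross_entropy(logits, targets)
    kd = F.kl_div(logits.log_softmax(dim=-1), soft_labels[targets],
                  reduction="batchmean")
    return (1 - alpha) * ce + alpha * kd
```

In a training loop, `accumulate` would be called on each batch, `tf_kd_loss` would replace plain cross-entropy, and `step_epoch` would be called once per epoch so that the soft labels always reflect the preceding epoch's correct predictions, removing the need for a separately trained teacher network.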