ISCA Archive Interspeech 2022

Cross-Layer Similarity Knowledge Distillation for Speech Enhancement

Jiaming Cheng, Ruiyu Liang, Yue Xie, Li Zhao, Björn Schuller, Jie Jia, Yiyuan Peng

Speech enhancement (SE) algorithms based on deep neural networks (DNNs) often face limited hardware resources or strict latency requirements when deployed in real-world scenarios, yet strong enhancement performance typically requires a large DNN. In this paper, a knowledge distillation framework for SE is proposed to compress the DNN model. We study a strategy of cross-layer connection paths, which fuses multi-level information from the teacher and transfers it to the student. To adapt distillation to the SE task, we propose a frame-level similarity distillation loss. We apply this method to the deep complex convolution recurrent network (DCCRN) and make targeted adjustments. Experimental results show that the proposed method considerably improves the enhancement performance of the compressed model and outperforms other distillation methods.
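The abstract does not spell out the exact form of the frame-level similarity distillation loss, but the idea of matching similarity structure rather than raw activations can be illustrated with a minimal sketch. The version below is an assumption inspired by similarity-preserving knowledge distillation: each utterance's features of shape `[batch, frames, channels]` are turned into a frame-by-frame cosine-similarity matrix, and the student is trained to match the teacher's matrix. The function name, tensor shapes, and choice of MSE are hypothetical, not the authors' formulation.

```python
# Hypothetical sketch of a frame-level similarity distillation loss.
# Shapes, normalization, and the MSE objective are assumptions; the
# paper's exact loss may differ.
import torch
import torch.nn.functional as F


def frame_similarity_kd_loss(teacher_feat: torch.Tensor,
                             student_feat: torch.Tensor) -> torch.Tensor:
    """teacher_feat, student_feat: [batch, frames, channels].

    Builds a frame-by-frame cosine-similarity matrix per utterance and
    penalizes the teacher-student mismatch, so the student mimics the
    relations between frames rather than raw activations.
    """
    def sim_matrix(feat: torch.Tensor) -> torch.Tensor:
        feat = F.normalize(feat, dim=-1)               # unit-norm each frame
        return torch.bmm(feat, feat.transpose(1, 2))   # [batch, frames, frames]

    g_teacher = sim_matrix(teacher_feat)
    g_student = sim_matrix(student_feat)
    # Mean squared error over all frame pairs, averaged over the batch.
    return F.mse_loss(g_student, g_teacher)
```

One appeal of this similarity-based formulation is that the teacher and student channel dimensions need not match, which suits compressed student networks with narrower layers.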