ISCA Archive Interspeech 2023

CLRL-Tuning: A Novel Continual Learning Approach for Automatic Speech Recognition

Zhihan Wang, Feng Hou, Ruili Wang

In this paper, we propose a novel continual learning approach, Randomly Layer-wise Tuning (CLRL-Tuning), for a pre-trained Automatic Speech Recognition (ASR) model. CLRL-Tuning addresses the variability of subsequent datasets by updating the parameters of randomly selected encoder layers of the pre-trained model (e.g., wav2vec 2.0) at every training epoch. CLRL-Tuning differs from previous approaches in that it neither replays previous datasets nor expands or reruns previous models. Furthermore, we evaluate our approach against four strong baselines, including Knowledge Distillation and Gradient Episodic Memory, and it achieves significant improvements over these baselines in average word error rate (WER) for the wav2vec 2.0 model. Additionally, we conduct ablation studies by tuning one, three, six, and all encoder layers of the model; the results show that tuning only one encoder layer at each training epoch is the most effective way to mitigate catastrophic forgetting.
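To make the core idea concrete, the sketch below shows one way the per-epoch random layer selection could be implemented. The paper does not publish code, so this is a minimal illustration under stated assumptions: a toy transformer encoder stands in for the pre-trained wav2vec 2.0 model, the dummy loss stands in for an ASR objective such as CTC, and the helper `set_trainable_layers` is a hypothetical name, not the authors' implementation.

```python
import random
import torch
import torch.nn as nn

# Toy stand-in for a pre-trained ASR encoder (e.g., wav2vec 2.0):
# a stack of transformer layers exposed as an nn.ModuleList.
class ToyEncoder(nn.Module):
    def __init__(self, num_layers=12, dim=64, heads=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, heads, batch_first=True)
            for _ in range(num_layers)
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

def set_trainable_layers(encoder, layer_ids):
    """Freeze every encoder layer, then unfreeze only the chosen ones."""
    for i, layer in enumerate(encoder.layers):
        requires_grad = i in layer_ids
        for p in layer.parameters():
            p.requires_grad = requires_grad

encoder = ToyEncoder()
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)

num_epochs, layers_per_epoch = 5, 1  # per the ablation, tuning one layer works best
for epoch in range(num_epochs):
    # Re-sample which encoder layer(s) to tune at the start of every epoch.
    chosen = random.sample(range(len(encoder.layers)), k=layers_per_epoch)
    set_trainable_layers(encoder, chosen)

    x = torch.randn(8, 50, 64)       # dummy batch of frame features
    loss = encoder(x).pow(2).mean()  # placeholder for an ASR loss such as CTC
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: tuned layer(s) {chosen}, loss {loss.item():.4f}")
```

Because only the sampled layer receives gradients in a given epoch, most of the pre-trained parameters stay untouched at any one time, which is consistent with the paper's claim that the method needs neither stored previous datasets nor extra model copies.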