ISCA Archive Interspeech 2023

Adapter Incremental Continual Learning of Efficient Audio Spectrogram Transformers

Nithish Muthuchamy Selvaraj, Xiaobao Guo, Adams Kong, Bingquan Shen, Alex Kot

Efficient tuning of neural networks for continual learning with minimal computational resources remains a challenge. In this paper, we propose continual learning of audio classifiers with parameter- and compute-efficient Audio Spectrogram Transformers (AST). To reduce the number of trainable parameters without degrading performance, we propose an AST with a Convolutional Adapter, which trains less than 5% of the parameters required by full fine-tuning. To reduce the computational complexity of self-attention, we introduce a novel Frequency-Time factorized Attention (FTA) method that achieves competitive performance with only a fraction of the computations. Finally, we formulate our method, Adapter Incremental Continual Learning (AI-CL), as the combination of the parameter-efficient Convolutional Adapter and the compute-efficient FTA. Experiments on the ESC-50, SpeechCommandsV2, and Audio-Visual Event benchmarks show that the proposed method learns new tasks efficiently and prevents catastrophic forgetting.
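
To make the adapter idea concrete, the PyTorch sketch below shows a bottleneck adapter with a depthwise convolution between the down- and up-projections, added as a residual branch. The class name, dimensions, and layer choices are illustrative assumptions, not the paper's exact implementation; the point is that only the small adapter is trained while the AST backbone stays frozen.

```python
import torch
import torch.nn as nn

class ConvAdapter(nn.Module):
    """Sketch of a convolutional bottleneck adapter (illustrative, not the
    paper's exact layer). Projects tokens down, mixes them with a depthwise
    1-D convolution, projects back up, and adds a residual connection."""
    def __init__(self, dim=768, bottleneck=64, kernel_size=3):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        # depthwise conv mixes information across the token sequence cheaply
        self.conv = nn.Conv1d(bottleneck, bottleneck, kernel_size,
                              padding=kernel_size // 2, groups=bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):                # x: (batch, tokens, dim)
        h = self.down(x)
        h = self.conv(h.transpose(1, 2)).transpose(1, 2)
        h = self.up(self.act(h))
        return x + h                     # residual: the adapter only refines

if __name__ == "__main__":
    adapter = ConvAdapter()
    tokens = torch.randn(2, 100, 768)    # (batch, patch tokens, embed dim)
    print(adapter(tokens).shape)         # torch.Size([2, 100, 768])
```

In a continual-learning setup of this kind, the backbone parameters would be frozen (`p.requires_grad = False`) and only the per-task adapter parameters trained, which is how the trainable fraction stays below a few percent of full fine-tuning.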
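The factorization idea behind FTA can be sketched similarly: instead of full self-attention over all F×T spectrogram patches, with cost growing as O(F²T²), attention is applied along the frequency axis and then along the time axis, costing O(F²T + T²F). The layer below is a minimal sketch of such a frequency-time factorization, assuming a frequency-major token grid; it is not the paper's exact FTA layer.

```python
import torch
import torch.nn as nn

class FactorizedFTAttention(nn.Module):
    """Sketch of frequency-time factorized attention: attend across the
    frequency axis (one sequence per time step), then across the time axis
    (one sequence per frequency bin), instead of over the full patch grid."""
    def __init__(self, dim=768, heads=8):
        super().__init__()
        self.freq_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.time_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x, f, t):          # x: (batch, f*t, dim), freq-major
        b, _, d = x.shape
        # attend across frequency: group tokens by time step
        xf = x.view(b, f, t, d).permute(0, 2, 1, 3).reshape(b * t, f, d)
        xf, _ = self.freq_attn(xf, xf, xf)
        x = xf.reshape(b, t, f, d).permute(0, 2, 1, 3).reshape(b, f * t, d)
        # attend across time: group tokens by frequency bin
        xt = x.view(b, f, t, d).reshape(b * f, t, d)
        xt, _ = self.time_attn(xt, xt, xt)
        return xt.reshape(b, f * t, d)

if __name__ == "__main__":
    attn = FactorizedFTAttention()
    f, t = 12, 101                       # frequency x time patch grid (illustrative)
    x = torch.randn(2, f * t, 768)
    print(attn(x, f, t).shape)           # torch.Size([2, 1212, 768])
```

Because each attention pass sees sequences of length F or T rather than F·T, the quadratic term shrinks from (FT)² to F²T + T²F, which is the "fraction of the computations" the abstract refers to.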