ISCA Archive Interspeech 2024

Shared-Adapters: A Novel Transformer-based Parameter Efficient Transfer Learning Approach For Children’s Automatic Speech Recognition

Thomas Rolland, Alberto Abad

Automatic Speech Recognition (ASR) of children's speech remains challenging, largely due to data scarcity, which makes training large ASR models from scratch particularly difficult. To mitigate this issue, fine-tuning of models pre-trained on adult speech is commonly employed. However, fine-tuning large pre-trained models with limited data poses its own challenges. In response, this study investigates Parameter-Efficient Fine-Tuning (PEFT) for children's ASR. Various PEFT approaches are explored, with emphasis on achieving strong recognition performance while minimising the number of trainable parameters. Our investigation identifies residual Adapters as the most efficient technique. Moreover, motivated by redundancies across Transformer layers, we propose the Shared-Adapter and its highly parameter-efficient variant, the Light Shared-Adapter. Our findings demonstrate that Shared-Adapters strike an excellent balance between recognition performance and parameter efficiency.
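To make the core idea concrete, the sketch below shows a standard bottleneck residual Adapter and an illustrative "Shared-Adapter" wiring in which a single adapter instance is reused after every Transformer layer, so its parameters are counted once regardless of network depth. This is a minimal PyTorch sketch under common assumptions from the adapter literature (Houlsby-style down-projection, non-linearity, up-projection, residual connection); the class names, bottleneck dimension, and insertion points are illustrative, not the authors' exact implementation.

```python
import torch
import torch.nn as nn


class ResidualAdapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project,
    with a residual connection around the whole block."""

    def __init__(self, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))


class EncoderWithSharedAdapter(nn.Module):
    """Illustrative Shared-Adapter wiring (assumption, not the paper's code):
    one adapter instance is applied after every frozen Transformer layer."""

    def __init__(self, layers: nn.ModuleList, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.layers = layers                              # pre-trained backbone
        self.adapter = ResidualAdapter(d_model, bottleneck)  # single shared module
        # Freeze the backbone: only the shared adapter is trained.
        for p in self.layers.parameters():
            p.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
            x = self.adapter(x)  # same adapter parameters reused at every layer
        return x


# Toy usage: a 12-layer encoder adapted with one shared bottleneck adapter.
layers = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)
    for _ in range(12)
)
encoder = EncoderWithSharedAdapter(layers, d_model=256)
out = encoder(torch.randn(2, 100, 256))  # (batch, time, d_model)
trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")  # one adapter's worth, not 12
```

Note the design trade-off this sketch illustrates: per-layer adapters multiply the adapter parameter count by the number of layers, whereas sharing a single adapter keeps the trainable footprint constant in depth, which is the parameter-efficiency argument the abstract makes for the Shared-Adapter.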