ISCA Archive SIGUL 2023

Multilingual Models with Language Embeddings for Low-resource Speech Recognition

Léa-Marie Lam-Yee-Mui, Waad Ben Kheder, Viet-Bac Le, Claude Barras, Jean-Luc Gauvain

Speech recognition for low-resource languages remains challenging and can be addressed with techniques such as multilingual modeling and transfer learning. In this work, we explore several solutions to the multilingual training problem: training monolingual models with multilingual features, adapting a multilingual model with transfer learning, and using language embeddings as additional features. To develop practical solutions, we focus our work on medium-sized hybrid ASR models. The multilingual models are trained on 270 hours of IARPA Babel data from 25 languages, and results are reported on 4 Babel languages for the Limited Language Pack (LLP) condition. The results show that adapting a multilingual acoustic model with language embeddings is an effective solution, outperforming the baseline monolingual models and providing results comparable to models based on state-of-the-art XLSR-53 features, with the advantage of needing 15 times fewer parameters.
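
As an illustration of what using language embeddings as additional features can look like, the following minimal sketch appends a per-language vector to every acoustic frame before the frames reach the acoustic model. The one-hot embedding table, the 40-dimensional features, and the function name `append_language_embedding` are assumptions made for this example only; they are not taken from the system described here, which may use a different embedding type and a different injection point.

```python
import numpy as np


def append_language_embedding(features, lang_id, embedding_table):
    """Concatenate one language vector to every acoustic frame.

    features:        array of shape (num_frames, feat_dim)
    lang_id:         integer index of the utterance's language
    embedding_table: array of shape (num_languages, embed_dim)
    """
    lang_vec = embedding_table[lang_id]                      # (embed_dim,)
    tiled = np.tile(lang_vec, (features.shape[0], 1))        # (num_frames, embed_dim)
    return np.concatenate([features, tiled], axis=1)


# Hypothetical setup: 25 training languages, one-hot language vectors
# (an assumption; learned embeddings would be used the same way).
num_languages = 25
embedding_table = np.eye(num_languages, dtype=np.float32)

# Placeholder utterance: 100 frames of 40-dimensional features (assumed sizes).
frames = np.zeros((100, 40), dtype=np.float32)
augmented = append_language_embedding(frames, lang_id=3, embedding_table=embedding_table)
```

In this sketch the language vector is constant across all frames of an utterance, so the model receives the same language cue at every time step.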