ISCA Archive Interspeech 2025

Evaluating Speech Foundation Models for Automatic Speech Recognition in the Low-Resource Kanyen’kéha Language

Mengzhe Geng, Patrick Littell, Aidan Pine, Robbie Jimerson, Gilles Boulianne, Vishwa Gupta, Rolando Coto-Solano, Anna Kazantseva, Marc Tessier, Delaney Lothian, Akwiratékha' Martin, Eric Joanis, Samuel Larkin, Roland Kuhn

Despite recent progress in automatic speech recognition (ASR) and speech foundation models (SFMs) for widely spoken languages, their application to low-resource Indigenous languages remains limited. This paper presents a systematic evaluation of SFMs for ASR development in Kanyen'kéha, a polysynthetic Iroquoian language that is structurally and typologically distinct from mainstream languages. To address the challenges posed by limited data and extensive vocabulary variation, we further investigate the impact of incorporating in-domain synthesized data and external language models during cross-lingual transfer learning. Experiments on the low-resource Kanyen'kéha corpus, under various train/test splits, show that the best system obtains a word error rate (WER) of 13.73% and a character error rate (CER) of 2.21% on the test set with a 59.2% out-of-vocabulary (OOV) rate. Excluding easily correctable errors further reduces the WER and CER to 10.36% and 1.76%, respectively, demonstrating the system's potential to support language documentation and revitalization.
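For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how WER, CER, and OOV rate are conventionally computed. It is not the authors' evaluation code. The function names are hypothetical, and the sketch assumes whitespace tokenization, a CER that counts spaces as characters, and a token-level OOV rate measured against the training vocabulary. The toy strings are placeholders, not Kanyen'kéha data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(refs, hyps):
    """Word error rate: total word-level edits over total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    total = sum(len(r.split()) for r in refs)
    return errors / total


def cer(refs, hyps):
    """Character error rate; spaces count as characters (an assumption)."""
    errors = sum(edit_distance(list(r), list(h)) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return errors / total


def oov_rate(train_texts, test_texts):
    """Fraction of test word tokens unseen in training (token-level assumption)."""
    vocab = {w for t in train_texts for w in t.split()}
    test_words = [w for t in test_texts for w in t.split()]
    return sum(w not in vocab for w in test_words) / len(test_words)


# Toy placeholder strings, not data from the paper.
refs = ["a b c d"]
hyps = ["a x c"]
print(f"WER = {wer(refs, hyps):.2%}")
print(f"CER = {cer(refs, hyps):.2%}")
print(f"OOV = {oov_rate(['a b'], ['a c c']):.2%}")
```

In a polysynthetic language, a single word can carry many morphemes, so one wrong character makes the whole word count as an error. Under these definitions, that is consistent with a WER well above the CER, as in the results reported above.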