ISCA Archive Interspeech 2022

Plugging a neural phoneme recognizer into a simple language model: a workflow for low-resource setting

Séverine Guillaume, Guillaume Wisniewski, Benjamin Galliot, Minh-Châu Nguyên, Maxime Fily, Guillaume Jacques, Alexis Michaud

Recently, several works have shown that fine-tuning a multilingual model of speech representation (typically XLS-R) with very small amounts of annotated data allows for the development of phonemic transcription systems of sufficient quality to help field linguists in their efforts to document the languages of the world. In this work, we explain how the quality of these systems can be improved by a very simple method, namely integrating them with a language model. Our experiments on an endangered language, Japhug (Trans-Himalayan/Tibeto-Burman), show that this approach can significantly reduce the WER, reaching the stage of automatic recognition of entire words.