Text corpus size is an important issue when building a language model (LM), particularly for languages where little data is available. This paper introduces an LM adaptation technique that improves an LM built from a small amount of task-dependent text with the help of a machine-translated text corpus. Word error rate experiments on Icelandic were performed using data machine translated (MT) from English to Icelandic on a sentence-by-sentence and a word-by-word basis. The baseline word error rate was 49.6%. Interpolating the baseline LM with an LM built from sentence-by-sentence translated text reduced the word error rate significantly to 41.9%.
Index Terms— Language Model Adaptation, Automatic Speech Recognition, Machine Translation, Sparse Text Corpus, Resource Deficient Languages.
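As an illustration of the interpolation step named in the abstract, the following is a minimal sketch, assuming simple linear interpolation of two unigram models. The abstract does not specify the interpolation scheme, the n-gram order, or the weight, so the function names, the toy token lists, and the weight `lam = 0.5` are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def unigram_lm(tokens):
    """Maximum-likelihood unigram probabilities from a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(p_base, p_mt, lam):
    """Linear interpolation: lam * p_base(w) + (1 - lam) * p_mt(w).

    A word missing from one model contributes probability 0 from that model.
    """
    vocab = set(p_base) | set(p_mt)
    return {
        word: lam * p_base.get(word, 0.0) + (1.0 - lam) * p_mt.get(word, 0.0)
        for word in vocab
    }


# Toy stand-ins (hypothetical): a small task-dependent corpus and a
# machine-translated corpus, each reduced to a short token list.
base = unigram_lm("a b a c".split())
mt = unigram_lm("a d d b".split())

# Assumed interpolation weight; in practice it would be tuned on held-out data.
mixed = interpolate(base, mt, lam=0.5)
```

Because both inputs are proper probability distributions and the weights sum to one, the interpolated model is also a proper distribution. Words seen only in the machine-translated text (here `d`) receive nonzero probability, which is the intended benefit when the task-dependent corpus is sparse.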