ISCA Archive Eurospeech 1997

MDI adaptation of language models across corpora

P. Srinivasa Rao, Satya Dharanipragada, Salim Roukos

The amount of text data available from a corpus for training language models is usually limited. Data from larger general or related corpora can be utilized to improve the performance of the language model on the corpus of interest. We explore one method of adapting a prior model from a large corpus to a smaller one of interest. Perplexity results of adapting a prior model constructed using the NAB corpus to the Switchboard and ATIS corpora are presented and compared with those of interpolated models.
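As a rough illustration of the interpolated baseline and the perplexity measure mentioned in the abstract, the sketch below linearly interpolates two toy unigram distributions and scores a held-out word sequence. The abstract does not give the model order, the interpolation weights, or any data, so the unigram form, the weight `lam`, the function names `interpolate` and `perplexity`, and all probabilities are assumptions chosen for illustration. This is not the MDI adaptation method the paper explores.

```python
import math

def interpolate(p_small, p_large, lam):
    """Linear interpolation: lam * p_small(w) + (1 - lam) * p_large(w).

    p_small and p_large are dicts mapping word -> probability; a word
    missing from one model is treated as having probability 0 there.
    """
    vocab = set(p_small) | set(p_large)
    return {
        w: lam * p_small.get(w, 0.0) + (1.0 - lam) * p_large.get(w, 0.0)
        for w in vocab
    }

def perplexity(model, tokens):
    """Perplexity = exp of the average negative log-probability per token."""
    log_sum = sum(math.log(model[w]) for w in tokens)
    return math.exp(-log_sum / len(tokens))

# Toy distributions (hypothetical numbers, not taken from the paper).
p_domain = {"flight": 0.5, "to": 0.3, "boston": 0.2}                   # small corpus of interest
p_general = {"flight": 0.1, "to": 0.4, "boston": 0.1, "market": 0.4}  # large general corpus

mixed = interpolate(p_domain, p_general, lam=0.7)
held_out = ["flight", "to", "boston"]

print("domain only:  ", perplexity(p_domain, held_out))
print("general only: ", perplexity(p_general, held_out))
print("interpolated: ", perplexity(mixed, held_out))
```

Lower perplexity on held-out text from the corpus of interest indicates a better fit. In practice `lam` would be tuned on held-out data rather than fixed by hand.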