ISCA Archive Interspeech 2005

Effective topic-tree based language model adaptation

Javier Dieguez-Tirado, Carmen García Mateo, Antonio Cardenal-Lopez

We work on language model adaptation schemes well suited to limited-resource scenarios. To take advantage of available out-of-domain corpora, we investigated language model adaptation using topic mixtures, a technique that has not given good practical results in the past. In this paper, we make several modifications to an existing tree-based approach: the tree is obtained from the background corpus by means of partitional clustering, all of its nodes are exploited in the adapted model, and error-free in-domain transcriptions are used as the adaptation corpus. The modified technique yielded a 14% perplexity improvement in a bilingual BN task, outperforming several non-hierarchical approaches. A strategy for early application of the language model allowed this perplexity improvement to be translated into a 4% WER reduction.
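The general idea of topic-mixture adaptation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes smoothed unigram models, one per topic cluster, with the tree nodes flattened into a plain list, and it estimates the mixture weights by expectation-maximization (EM) on a small in-domain adaptation text. The function names, the toy corpora, and the add-one smoothing are all hypothetical choices made for illustration.

```python
from collections import Counter


def unigram_model(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram model over a fixed vocabulary (assumed setup)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def em_mixture_weights(models, adaptation_tokens, iterations=50):
    """Estimate mixture weights that maximize the likelihood of the adaptation text."""
    k = len(models)
    weights = [1.0 / k] * k  # start from a uniform mixture
    for _ in range(iterations):
        expected = [0.0] * k
        for w in adaptation_tokens:
            # E-step: posterior probability of each component for this token
            joint = [weights[i] * models[i][w] for i in range(k)]
            norm = sum(joint)
            for i in range(k):
                expected[i] += joint[i] / norm
        # M-step: new weights are the average posteriors
        weights = [e / len(adaptation_tokens) for e in expected]
    return weights


def mixture_prob(models, weights, word):
    """Probability of a word under the adapted (interpolated) model."""
    return sum(lam * m[word] for lam, m in zip(weights, models))


# Toy "topic clusters" standing in for nodes of a topic tree (hypothetical data).
sports = "goal match team goal team match".split()
finance = "stock market bank stock bank market".split()
adaptation = "team goal match team".split()  # in-domain adaptation text

vocab = sorted(set(sports) | set(finance) | set(adaptation))
models = [unigram_model(sports, vocab), unigram_model(finance, vocab)]
weights = em_mixture_weights(models, adaptation)

for name, lam in zip(["sports", "finance"], weights):
    print(f"{name}: {lam:.3f}")
```

In this toy setting the adaptation text contains only sports vocabulary, so the EM procedure shifts most of the weight to the sports component. A tree-based variant would include models for internal nodes as additional mixture components rather than only the leaf clusters.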

doi: 10.21437/Interspeech.2005-19

Cite as: Dieguez-Tirado, J., Mateo, C.G., Cardenal-Lopez, A. (2005) Effective topic-tree based language model adaptation. Proc. Interspeech 2005, 1289-1292, doi: 10.21437/Interspeech.2005-19

  author={Javier Dieguez-Tirado and Carmen García Mateo and Antonio Cardenal-Lopez},
  title={{Effective topic-tree based language model adaptation}},
  booktitle={Proc. Interspeech 2005},