ISCA Archive Eurospeech 1997

N-gram language model adaptation using small corpus for spoken dialog recognition

Akinori Ito, Hideyuki Saitoh, Masaharu Katoh, Masaki Kohda

This paper describes an N-gram language model adaptation technique. Because an N-gram model requires a large sample corpus for probability estimation, it is difficult to apply N-gram models to a small, specific task. In this paper, N-gram task adaptation is proposed using a large corpus of the general task (TI text) and a small corpus of the specific task (AD text). A simple weighting is employed to mix the TI and AD texts. In addition to mixing the two texts, the effect of vocabulary is also investigated. The experimental results show that the adapted N-gram model with a proper vocabulary size has significantly lower perplexity than the task-independent models.
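The abstract does not spell out how the weighting is applied, so the following is only a minimal sketch of one plausible reading: bigram counts from the AD text are multiplied by a weight and added to the TI counts before probabilities are estimated. The function names, the toy sentences, the weight value, and the add-alpha smoothing are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

def bigram_counts(sentences):
    """Count bigram histories and bigrams over tokenized sentences."""
    history, bigram = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            history[prev] += 1
            bigram[(prev, cur)] += 1
    return history, bigram

def mix_counts(ti, ad, weight):
    """Add AD counts, scaled by `weight`, to the TI counts (assumed scheme)."""
    mixed = Counter()
    for key, count in ti.items():
        mixed[key] += count
    for key, count in ad.items():
        mixed[key] += weight * count
    return mixed

def bigram_prob(prev, cur, history, bigram, vocab_size, alpha=1.0):
    """Add-alpha smoothed bigram probability; the smoothing is an assumption."""
    return (bigram[(prev, cur)] + alpha) / (history[prev] + alpha * vocab_size)

# Hypothetical toy data: a larger general corpus and a tiny task corpus.
ti_text = [["i", "want", "a", "ticket"], ["show", "me", "the", "news"]]
ad_text = [["i", "want", "a", "room"]]

ti_hist, ti_bi = bigram_counts(ti_text)
ad_hist, ad_bi = bigram_counts(ad_text)

weight = 5.0  # hypothetical value; in practice it would be tuned on held-out task data
mixed_hist = mix_counts(ti_hist, ad_hist, weight)
mixed_bi = mix_counts(ti_bi, ad_bi, weight)

# Vocabulary of predictable tokens: every word plus the end-of-sentence marker.
vocab = {w for sent in ti_text + ad_text for w in sent} | {"</s>"}

print(bigram_prob("a", "room", mixed_hist, mixed_bi, len(vocab)))
print(bigram_prob("a", "ticket", mixed_hist, mixed_bi, len(vocab)))
```

With a weight above 1, the task-specific continuation ("room") receives more probability mass than the general one ("ticket") even though each occurs once in its own corpus. The vocabulary size enters the smoothed estimate directly, which hints at why the choice of vocabulary can affect perplexity.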


doi: 10.21437/Eurospeech.1997-690

Cite as: Ito, A., Saitoh, H., Katoh, M., Kohda, M. (1997) N-gram language model adaptation using small corpus for spoken dialog recognition. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 2735-2738, doi: 10.21437/Eurospeech.1997-690

@inproceedings{ito97_eurospeech,
  author={Akinori Ito and Hideyuki Saitoh and Masaharu Katoh and Masaki Kohda},
  title={{N-gram language model adaptation using small corpus for spoken dialog recognition}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={2735--2738},
  doi={10.21437/Eurospeech.1997-690},
  issn={1018-4074}
}