ISCA Archive Interspeech 2005

Trigger-based language model adaptation for automatic meeting transcription

Carlos Troncoso, Tatsuya Kawahara

We present a novel trigger-based language model adaptation method oriented to the transcription of meetings. In meetings, the topic is focused and consistent throughout the whole session, so keywords can be correlated over long distances. The trigger-based language model is designed to capture such long-distance dependencies, but it is typically constructed from a large corpus, which is usually too general to yield task-dependent trigger pairs. In the proposed method, we use the initial speech recognition results to extract task-dependent trigger pairs and to estimate their statistics. Moreover, we introduce a back-off scheme that also exploits the statistics estimated from a large corpus. The proposed model reduced the test-set perplexity twice as much as the typical trigger-based language model constructed from a large corpus, and achieved a remarkable perplexity reduction of 41% over the baseline when combined with an adapted trigram language model.
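The idea of estimating trigger-pair statistics from initial recognition results and backing off to a large corpus can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the pair selection criterion or the back-off formula, so the history window size (`window`), the count threshold (`min_count`), the plain relative-frequency estimate, and the function names `trigger_stats` and `trigger_prob` are all assumptions made for illustration.

```python
from collections import Counter


def trigger_stats(sentences, window=10):
    """Count, for each (trigger, target) pair, how often `target` occurs
    with `trigger` among the `window` preceding words (assumed window size)."""
    pair_counts = Counter()     # (trigger, target) -> co-occurrence count
    trigger_counts = Counter()  # trigger -> number of positions with it in history
    for words in sentences:
        for i, target in enumerate(words):
            history = set(words[max(0, i - window):i])
            for trig in history:
                pair_counts[(trig, target)] += 1
                trigger_counts[trig] += 1
    return pair_counts, trigger_counts


def trigger_prob(trig, target, task, general, min_count=2):
    """Relative-frequency estimate of P(target | trig in history).

    Uses the task statistics (e.g. from initial ASR output) when the pair
    was seen at least `min_count` times there; otherwise falls back to the
    large-corpus statistics. This hard fallback is a simplification, not
    the back-off scheme of the paper.
    """
    task_pairs, task_trigs = task
    if task_pairs[(trig, target)] >= min_count:
        return task_pairs[(trig, target)] / task_trigs[trig]
    gen_pairs, gen_trigs = general
    if gen_trigs[trig] > 0:
        return gen_pairs[(trig, target)] / gen_trigs[trig]
    return 0.0


if __name__ == "__main__":
    # Toy data standing in for initial ASR hypotheses and a large corpus.
    asr_output = [["budget", "meeting", "budget", "plan"], ["budget", "plan"]]
    large_corpus = [["budget", "cut"], ["meeting", "room"]]
    task = trigger_stats(asr_output)
    general = trigger_stats(large_corpus)
    print(trigger_prob("budget", "plan", task, general))   # task estimate
    print(trigger_prob("meeting", "room", task, general))  # large-corpus fallback
```

In this sketch the task-specific estimate is preferred whenever the pair has enough support in the recognition output, and the general corpus only fills in pairs the session itself does not cover. A real system would typically also smooth these estimates and combine them with an n-gram model.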

doi: 10.21437/Interspeech.2005-21

Cite as: Troncoso, C., Kawahara, T. (2005) Trigger-based language model adaptation for automatic meeting transcription. Proc. Interspeech 2005, 1297-1300, doi: 10.21437/Interspeech.2005-21

  author={Carlos Troncoso and Tatsuya Kawahara},
  title={{Trigger-based language model adaptation for automatic meeting transcription}},
  booktitle={Proc. Interspeech 2005},