ISCA Archive IWSLT 2009

Online language model adaptation for spoken dialog translation

Germán Sanchis-Trilles, Mauro Cettolo, Nicola Bertoldi, Marcello Federico

This paper focuses on the problem of language model adaptation in the context of Chinese-English cross-lingual dialogs, as set up by the challenge task of the IWSLT 2009 Evaluation Campaign. We investigate mixtures of n-gram language models obtained by clustering the bilingual training data according to the available human annotations at the dialog level, the turn level, and the dialog act level. In the last case, the clustering of the IWSLT data was induced through a comparable Italian-English parallel corpus annotated with dialog acts. For adaptation, mixture weights are estimated either on a single source sentence or on the whole test set. The estimated weights are then transferred to the target language mixture model. Experimental results show that training several specific language models and weighting them according to the actual input, instead of using a single target language model, yields significant gains in terms of perplexity and BLEU.
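
The weight estimation and transfer step described above can be illustrated with a minimal sketch. It assumes the weights are fitted with a standard EM procedure for linear mixtures, which the abstract does not confirm, and that the per-token probabilities from each cluster-specific source language model are already available. The names `estimate_mixture_weights` and `mixture_prob` are hypothetical and used here only for illustration.

```python
def estimate_mixture_weights(component_probs, iterations=50):
    """Fit linear mixture weights with EM (an assumed estimation method).

    component_probs[t][k] is the probability that the k-th cluster-specific
    SOURCE language model assigns to token t of the input sentence (or of
    the whole test set) given its history.
    """
    num_components = len(component_probs[0])
    weights = [1.0 / num_components] * num_components  # uniform start
    for _ in range(iterations):
        expected = [0.0] * num_components
        for probs in component_probs:
            total = sum(w * p for w, p in zip(weights, probs))
            if total == 0.0:
                continue  # token unseen by every component: no evidence
            # E-step: posterior responsibility of each component for this token
            for k in range(num_components):
                expected[k] += weights[k] * probs[k] / total
        norm = sum(expected)
        if norm == 0.0:
            break  # no usable evidence: keep the current weights
        # M-step: renormalise the expected counts into new weights
        weights = [e / norm for e in expected]
    return weights


def mixture_prob(weights, target_probs):
    """Score a target token with the weights estimated on the source side."""
    return sum(w * p for w, p in zip(weights, target_probs))


# Toy numbers (illustrative only): two clusters, three source tokens.
source_probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1]]
weights = estimate_mixture_weights(source_probs)
# Reuse the same weights for the corresponding target-side models.
p_target = mixture_prob(weights, [0.5, 0.05])
```

In this sketch the weights are fitted only on source-side probabilities and then reused unchanged to combine the target-side models of the same clusters, which mirrors the transfer step in the abstract under the stated assumptions.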


Cite as: Sanchis-Trilles, G., Cettolo, M., Bertoldi, N., Federico, M. (2009) Online language model adaptation for spoken dialog translation. Proc. International Workshop on Spoken Language Translation (IWSLT 2009), 160-167

@inproceedings{sanchistrilles09_iwslt,
  author={Germán Sanchis-Trilles and Mauro Cettolo and Nicola Bertoldi and Marcello Federico},
  title={{Online language model adaptation for spoken dialog translation}},
  year=2009,
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2009)},
  pages={160--167}
}