ISCA Archive IWSLT 2009

The CASIA statistical machine translation system for IWSLT 2009

Maoxi Li, Jiajun Zhang, Yu Zhou, Chengqing Zong

This paper reports on the participation of CASIA (Institute of Automation, Chinese Academy of Sciences) in the evaluation campaign of the International Workshop on Spoken Language Translation 2009. We participated in the challenge tasks for both Chinese-to-English and English-to-Chinese translation, and in the BTEC task for Chinese-to-English translation only. For all of the tasks, system performance is improved with the following methods: 1) combining different results of Chinese word segmentation, 2) combining different results of word alignment, 3) adding reliable bilingual words with high probabilities to the training data, 4) applying additional handling to named entities, including person names, location names, organization names, and temporal and numerical expressions, 5) combining and selecting translations from the outputs of multiple translation engines, 6) replacing Chinese characters with Chinese Pinyin to train the translation model for the Chinese-to-English ASR challenge task. The last of these is a new approach that has not been introduced before.
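As a rough illustration of method 6, the sketch below replaces each Chinese character with a Pinyin syllable before the text would be passed to translation-model training. It is not the authors' implementation: the table `TOY_PINYIN`, the function `to_pinyin`, and the choice to drop tone marks are all assumptions made for this example, and a real system would need a full pronunciation lexicon. One plausible motivation, not stated in the abstract, is that homophone errors in ASR output collapse to the same Pinyin string.

```python
# Hypothetical toy character-to-Pinyin table (tone marks omitted).
# A real system would use a complete pronunciation lexicon instead.
TOY_PINYIN = {
    "我": "wo",
    "想": "xiang",
    "订": "ding",
    "定": "ding",   # homophone of 订
    "房": "fang",
    "间": "jian",
}


def to_pinyin(sentence, table=TOY_PINYIN):
    """Replace each character with its Pinyin syllable.

    Characters missing from the table are kept unchanged, and
    whitespace in the input is dropped. Syllables are joined with
    single spaces.
    """
    return " ".join(table.get(ch, ch) for ch in sentence if not ch.isspace())


if __name__ == "__main__":
    reference = "我想订房间"   # "I would like to book a room"
    asr_output = "我想定房间"  # same sounds, one character differs
    print(to_pinyin(reference))
    print(to_pinyin(asr_output))
```

In this toy case the two inputs differ in one character but map to the same Pinyin sequence, so a model trained on Pinyin would see them as identical source sentences.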


Cite as: Li, M., Zhang, J., Zhou, Y., Zong, C. (2009) The CASIA statistical machine translation system for IWSLT 2009. Proc. International Workshop on Spoken Language Translation (IWSLT 2009), 83-90

@inproceedings{li09_iwslt,
  author={Maoxi Li and Jiajun Zhang and Yu Zhou and Chengqing Zong},
  title={{The CASIA statistical machine translation system for IWSLT 2009}},
  year=2009,
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2009)},
  pages={83--90}
}