ISCA Archive Interspeech 2007

Improvements in machine translation for English/Iraqi speech translation

S. Saleem, K. Subramanian, R. Prasad, David Stallard, Chia-Lin Kao, P. Natarajan, R. Suleiman

In this paper, we describe techniques for improving machine translation quality in the context of speech-to-speech translation between significantly different languages. Specifically, we explore three broad approaches for improving translation from English to Iraqi Arabic and vice versa. First, we investigate normalization techniques that address the differences between the spoken and written forms of both languages. Second, we incorporate additional knowledge sources, such as a bilingual lexicon and named entity detection, into the translation process. Third, we exploit the rich morphological structure of Iraqi Arabic using two different approaches. The first approach decomposes words in Iraqi Arabic, whereas the second, a novel approach, inflects English by combining key phrases into words using the minimum description length (MDL) criterion. We observe significant gains in accuracy when translating from both text and speech recognition output.
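The abstract does not give details of the phrase-combining step, so the following is only a minimal sketch of how an MDL-style criterion could drive such merging. It is not the authors' implementation. The two-part cost (a unigram code for the corpus plus a per-character cost for the lexicon), the `BITS_PER_CHAR` constant, the `_` joiner, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

BITS_PER_CHAR = 5.0  # assumed flat cost of spelling out one lexicon character


def description_length(sentences):
    """Two-part code length in bits: lexicon cost plus corpus cost."""
    counts = Counter(tok for sent in sentences for tok in sent)
    total = sum(counts.values())
    # Corpus cost: each token is coded with -log2 of its unigram probability.
    corpus_bits = -sum(c * math.log2(c / total) for c in counts.values())
    # Lexicon cost: each distinct token is spelled out once, plus a terminator.
    lexicon_bits = sum((len(tok) + 1) * BITS_PER_CHAR for tok in counts)
    return corpus_bits + lexicon_bits


def merge_pair(sentences, pair):
    """Replace every adjacent occurrence of `pair` with one joined token."""
    first, second = pair
    merged = []
    for sent in sentences:
        out, i = [], 0
        while i < len(sent):
            if i + 1 < len(sent) and sent[i] == first and sent[i + 1] == second:
                out.append(first + "_" + second)
                i += 2
            else:
                out.append(sent[i])
                i += 1
        merged.append(out)
    return merged


def mdl_merge(sentences, max_merges=10):
    """Greedily apply the merge that lowers description length the most."""
    current = [list(sent) for sent in sentences]
    best = description_length(current)
    for _ in range(max_merges):
        pairs = Counter(
            (sent[i], sent[i + 1]) for sent in current for i in range(len(sent) - 1)
        )
        candidate = None
        for pair in pairs:
            trial = merge_pair(current, pair)
            bits = description_length(trial)
            if bits < best:  # accept only merges that shorten the total code
                best, candidate = bits, trial
        if candidate is None:  # no merge helps any more: stop
            break
        current = candidate
    return current
```

Calling `mdl_merge` on a tokenized English corpus returns the same sentences with some frequent adjacent word pairs joined into single tokens. A merge is kept only if the bits saved on the corpus outweigh the cost of adding the longer entry to the lexicon.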