ISCA Archive Interspeech 2010

Named-entity projection and data-driven morphological decomposition for field maintainable speech-to-speech translation systems

Ian R. Lane, Alex Waibel

In this paper, we investigate methods to improve the handling of named entities in speech-to-speech translation systems, focusing on techniques applicable to under-resourced, morphologically complex languages. First, we introduce a method to efficiently bootstrap a named-entity recognizer for a new language by projecting tags from a well-resourced language across a bilingual corpus. Second, we propose a novel approach to automatically induce decomposition rules for morphologically complex languages. In our English-Iraqi speech-to-speech translation system, combining these two approaches significantly improved speech recognition and translation performance on military dialogs focused on the collection of information in the field.
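
As an illustration of the tag projection idea, the following is a minimal sketch, not the authors' implementation. It assumes that word alignments between the two sides of the bilingual corpus are already available as (source index, target index) pairs, which the abstract does not specify. The function name `project_tags`, the `O`/`LOC` tag set, and the placeholder target tokens are all hypothetical.

```python
from typing import List, Tuple


def project_tags(
    src_tags: List[str],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Copy entity tags from source tokens onto their aligned target tokens.

    Assumption: `alignment` holds (source_index, target_index) pairs
    produced by some external word aligner; "O" marks a non-entity token.
    """
    tgt_tags = ["O"] * tgt_len
    for src_idx, tgt_idx in alignment:
        tag = src_tags[src_idx]
        # Keep the first entity tag projected onto a target token;
        # later links never overwrite it.
        if tag != "O" and tgt_tags[tgt_idx] == "O":
            tgt_tags[tgt_idx] = tag
    return tgt_tags


# Hypothetical example: an English sentence tagged by an existing
# recognizer, aligned to a three-token target sentence (placeholders).
src_tokens = ["I", "live", "in", "Baghdad"]
src_tags = ["O", "O", "O", "LOC"]
tgt_tokens = ["t0", "t1", "t2"]
alignment = [(0, 0), (1, 0), (2, 1), (3, 2)]

projected = project_tags(src_tags, alignment, len(tgt_tokens))
print(list(zip(tgt_tokens, projected)))
```

Target sentences tagged this way could then serve as training data for a recognizer in the new language. How alignment errors and unaligned tokens are handled is left open in this sketch.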