In this paper, we present our latest investigations on pronunciation modeling and its impact on automatic speech recognition (ASR). We propose fully automatic methods to detect, remove, and substitute inconsistent or flawed entries in pronunciation dictionaries. The experiments were conducted on three tasks: (1) word-pronunciation pairs from the Czech, English, French, German, Polish, and Spanish Wiktionary [1], a multilingual wiki-based open content dictionary, (2) our GlobalPhone Hausa pronunciation dictionary [2], and (3) pronunciations to complement our Mandarin-English SEAME code-switch dictionary [3]. In the final results, we observed an average relative improvement of 2.0% in word error rate, and as much as 27.3% for the English Wiktionary word-pronunciation pairs.
Index Terms: pronunciation dictionaries, automatic error recovery, multilingual speech recognition
[1] “Wiktionary - a wiki-based open content dictionary”, Website, http://www.wiktionary.org.
[2] Schlippe, T., Komgang Djomgang, E. G., Vu, N. T., Ochs, S., and Schultz, T., “Hausa Large Vocabulary Continuous Speech Recognition”, SLTU, 2012.
[3] Vu, T., Lyu, D.-C., Weiner, J., Telaar, D., Schlippe, T., Blaicher, F., Chng, E.-S., Schultz, T., and Li, H., “A First Speech Recognition System For Mandarin-English Code-Switch Conversational Speech”, ICASSP, 2012.