ISCA Archive Eurospeech 1995

An approach to language identification with enhanced language model

Yonghong Yan, Etienne Barnard

An approach to Language Identification (LID) based on language-dependent phone recognition is presented. The LID system is designed to exploit the differing phonotactic constraints of different languages. Various LID features are extracted from the output of the language-dependent phone recognizers. Two methods are proposed to improve language modeling accuracy: (1) language models based on forward and backward bigrams, and (2) back-propagation-based language model optimization. The system was evaluated on a standard 11-language task and a standard nine-language task. On the 11-language task, the correct identification rate reached 87.6% for 45-second utterances and 73.6% for 10-second utterances; on the nine-language task, it reached 87.8% and 74.0% respectively. Adding channel normalization further improved the performance of our best systems to 90.8% and 77.1% on the 11-language task, and to 91.1% and 77.5% on the nine-language task.
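As an illustration of the forward and backward bigram idea mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes that each language has a set of decoded phone sequences for training, that a forward bigram conditions each phone on its predecessor and a backward bigram conditions it on its successor, and that the two log-scores are simply added with equal weight. The add-one smoothing, the function names, and the toy phone strings are all assumptions made for the example; the abstract does not specify any of them.

```python
import math
from collections import Counter


def train_bigram(sequences, reverse=False):
    """Count phone bigrams. With reverse=True each sequence is read
    right to left, so a phone is conditioned on its successor."""
    pair_counts, context_counts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        seq = list(reversed(seq)) if reverse else list(seq)
        vocab.update(seq)
        for context, phone in zip(seq, seq[1:]):
            pair_counts[(context, phone)] += 1
            context_counts[context] += 1
    return pair_counts, context_counts, vocab


def log_prob(seq, model, reverse=False):
    """Add-one smoothed bigram log-probability of a phone sequence
    (the smoothing scheme is an assumption, not from the paper)."""
    pair_counts, context_counts, vocab = model
    seq = list(reversed(seq)) if reverse else list(seq)
    v = len(vocab) + 1  # one extra slot for unseen phones
    total = 0.0
    for context, phone in zip(seq, seq[1:]):
        num = pair_counts[(context, phone)] + 1
        den = context_counts[context] + v
        total += math.log(num / den)
    return total


def identify(seq, models):
    """Pick the language whose forward + backward score is highest.
    Equal weighting of the two scores is an assumption."""
    scores = {
        lang: log_prob(seq, fwd) + log_prob(seq, bwd, reverse=True)
        for lang, (fwd, bwd) in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two "languages" with different phonotactics.
training = {
    "A": [["a", "b", "a", "b", "a", "b"]],
    "B": [["a", "a", "b", "b", "a", "a"]],
}
models = {
    lang: (train_bigram(seqs), train_bigram(seqs, reverse=True))
    for lang, seqs in training.items()
}

best, scores = identify(["a", "b", "a", "b"], models)
print(best, scores)
```

In a system of the kind described, the per-language scores would more plausibly serve as features for a later classifier than be compared directly, and the back-propagation-based optimization would tune the model parameters; neither step is shown here.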