ISCA Archive ICSLP 2002

Model partial pronunciation variations for spontaneous Mandarin speech recognition

Yi Liu, Pascale Fung

Modeling pronunciation variations is a critical part of spontaneous Mandarin speech recognition. Such variations include both complete changes and partial changes. Complete changes can usually be modeled by replacing the canonical phone with an alternate phone. Partial changes, which cannot be modeled by conventional methods, are variations within the phoneme and include diacritics. In this paper, we propose using a partial change phone model (PCPM) together with an auxiliary decision tree to model partial changes. A detailed but robust model can be achieved by merging the canonical model with PCPMs through Gaussian distribution reconstruction. The effectiveness of this approach was evaluated on the Hub4NE Mandarin Broadcast News Corpus. The syllable error rate decreased by 2.39% absolute with respect to the baseline.