ISCA Archive ICSLP 1996

Speaker-independent dictation of Chinese speech with 32k vocabulary

Bo Xu, Bing Ma, Shuwu Zhang, Fei Qu, Taiyi Huang

While early systems took isolated syllables as input units and required tedious speaker enrollment, our research focuses on speaker-independent, word-based dictation. A carefully designed 120-speaker database was built for training; acoustic models dependent on inter-syllable context, tone, and endpoints are applied with the promising MFCC features; two-pass acoustic matching accelerates recognition by taking full advantage of the monosyllabic structure of Chinese speech; and complete word bigram and trigram models serve as the language processing module. With these efforts, the system reaches 90% character accuracy while running in almost real time on a Pentium PC without DSP support.
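To illustrate how a word bigram and trigram can act together as a language processing module, the following minimal sketch scores word sequences with a linearly interpolated trigram model. It is not the authors' implementation: the abstract does not describe their smoothing or decoding, so the interpolation weights, the add-one unigram smoothing, the toy corpus, and the function names `train`, `prob`, and `sentence_logprob` are all assumptions made for illustration.

```python
import math
from collections import Counter


def train(sentences):
    """Collect unigram, bigram, and trigram counts from word-segmented sentences."""
    counts = {name: Counter() for name in ("uni", "bi", "tri", "ctx1", "ctx2")}
    for words in sentences:
        padded = ["<s>", "<s>"] + list(words) + ["</s>"]
        for i in range(2, len(padded)):
            w2, w1, w = padded[i - 2], padded[i - 1], padded[i]
            counts["uni"][w] += 1
            counts["bi"][(w1, w)] += 1
            counts["tri"][(w2, w1, w)] += 1
            counts["ctx1"][w1] += 1        # how often w1 occurs as a bigram history
            counts["ctx2"][(w2, w1)] += 1  # how often (w2, w1) occurs as a trigram history
    return counts


def prob(counts, w2, w1, w, lambdas=(0.1, 0.3, 0.6)):
    """Interpolated P(w | w2, w1); the lambda weights are assumed, not tuned."""
    l1, l2, l3 = lambdas
    total = sum(counts["uni"].values())
    vocab = len(counts["uni"]) + 1  # one extra slot for unseen words
    p1 = (counts["uni"][w] + 1) / (total + vocab)  # add-one smoothed unigram
    c1 = counts["ctx1"][w1]
    p2 = counts["bi"][(w1, w)] / c1 if c1 else 0.0
    c2 = counts["ctx2"][(w2, w1)]
    p3 = counts["tri"][(w2, w1, w)] / c2 if c2 else 0.0
    return l1 * p1 + l2 * p2 + l3 * p3


def sentence_logprob(counts, words):
    """Natural-log probability of a word sequence, including the end marker."""
    padded = ["<s>", "<s>"] + list(words) + ["</s>"]
    return sum(
        math.log(prob(counts, padded[i - 2], padded[i - 1], padded[i]))
        for i in range(2, len(padded))
    )


# Toy word-segmented corpus (hypothetical data, for illustration only).
corpus = [
    ["我", "喜欢", "音乐"],
    ["我", "喜欢", "电影"],
    ["他", "喜欢", "音乐"],
]
model = train(corpus)
seen = sentence_logprob(model, ["我", "喜欢", "音乐"])
scrambled = sentence_logprob(model, ["音乐", "喜欢", "我"])
```

In a dictation setting, scores of this kind would let a decoder prefer the word sequence that best fits its context among competing hypotheses for the same syllables; here `seen` comes out higher than `scrambled` because the first word order appears in the toy corpus.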