ISCA Archive SpeechProsody 2006

A new approach of using temporal information in Mandarin speech recognition

Jyh-Her Yang, Yuan-Fu Liao, Yih-Ru Wang, Sin-Horng Chen

This paper discusses a new approach that uses temporal information to assist Mandarin speech recognition. Two types of temporal information are incorporated into the recognition search. The first is a statistical syllable duration model that accounts for the influences of 411 base-syllables, 5 tones, 4 position-in-word factors, and 3 position-in-sentence factors on syllable duration. The second is timing information that models three types of inter-syllable boundary: intra-word, inter-word without a punctuation mark (PM), and inter-word with a PM. Both types of temporal information are expected to improve segmentation accuracy in acoustic decoding and in linguistic decoding. Experimental results showed that the base-syllable, character, and word recognition rates improved slightly on both the MATBN and Treebank databases.
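
The abstract does not give the form of the syllable duration model, so the following is only an illustrative sketch of how the four factor types could be combined into a duration score. It assumes an additive model (a global mean plus one offset per factor) with a single Gaussian over the residual; the class name `DurationModel`, its fields, and all numeric values are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass, field

# Factor inventory sizes as stated in the abstract.
N_SYLLABLES, N_TONES, N_WORD_POS, N_SENT_POS = 411, 5, 4, 3


@dataclass
class DurationModel:
    """Hypothetical additive duration model (an assumption, not the paper's)."""
    global_mean: float            # mean syllable duration in seconds
    std: float                    # residual standard deviation in seconds
    syllable: dict = field(default_factory=dict)  # offset per base-syllable
    tone: dict = field(default_factory=dict)      # offset per tone
    word_pos: dict = field(default_factory=dict)  # offset per position-in-word
    sent_pos: dict = field(default_factory=dict)  # offset per position-in-sentence

    def expected(self, syl: int, tone: int, wpos: int, spos: int) -> float:
        """Expected duration: global mean plus the four factor offsets."""
        for value, size in ((syl, N_SYLLABLES), (tone, N_TONES),
                            (wpos, N_WORD_POS), (spos, N_SENT_POS)):
            if not 0 <= value < size:
                raise ValueError(f"factor index {value} outside [0, {size})")
        # Unseen factor values contribute a zero offset.
        return (self.global_mean
                + self.syllable.get(syl, 0.0)
                + self.tone.get(tone, 0.0)
                + self.word_pos.get(wpos, 0.0)
                + self.sent_pos.get(spos, 0.0))

    def log_likelihood(self, duration: float, syl: int, tone: int,
                       wpos: int, spos: int) -> float:
        """Gaussian log-density of an observed duration under the model."""
        z = (duration - self.expected(syl, tone, wpos, spos)) / self.std
        return -0.5 * z * z - math.log(self.std * math.sqrt(2.0 * math.pi))


# Made-up parameters: one tone shortens the syllable, one sentence
# position lengthens it.
model = DurationModel(global_mean=0.20, std=0.05,
                      tone={3: -0.02}, sent_pos={2: 0.04})
near = model.log_likelihood(0.22, syl=10, tone=3, wpos=0, spos=2)
far = model.log_likelihood(0.40, syl=10, tone=3, wpos=0, spos=2)
```

In a recognition search, a score of this kind could be added to the acoustic and language model scores of a hypothesized syllable segment, so that segmentations with implausible durations are penalized. How the paper actually weights and integrates the duration and boundary information is not stated in the abstract.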