ISCA Archive ECST 1987

Compensating for vowel coarticulation in continuous speech recognition

James L. Hieronymus

Coarticulation alters vowel formant characteristics in continuous speech. Studies of isolated monosyllables in the literature suggest that some phonemes cause more severe distortions than others: the largest changes are caused by /r/, /l/, and /w/, and unstressed vowels are most affected. Previous studies by Holmes [6] and by us [13] indicate that these effects are even larger in continuous speech. Vowel recognition algorithms that do not take context into account normally achieve approximately 75% correct recognition in continuous speech when the three top choices from the recognizer are counted. Methods that explicitly model the phonetic context are expected to achieve higher performance. An ongoing study examines all 16 American English vowels using a subset of the DARPA acoustic-phonetic data base, a phonetically labeled data base of 6300 sentences from 630 talkers. A subset of 7 vowels, /iy/, /I/, /eh/, /ae/, /o/ and /u/, has been studied in all major contexts. The formant temporal patterns are being examined for phoneme triples and quintuples with a vowel in the center. These formant patterns are discussed along with some effects of stress and speaking rate.