ISCA Archive Eurospeech 1997

Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate

Hisao Kuwabara

Individual phonemes in continuous speech have been investigated, focusing mainly on their duration at three speaking rates: fast, normal, and slow. The speech material consists of fifteen short sentences, comprising a total of 291 morae, uttered by four male speakers. The normal speaking rate (n-speech) is, on average, 150 milliseconds/mora (or 400 morae/minute). Taking the n-speech as a reference, the four speakers were asked to read the sentences twice as fast (f-speech) and half as fast (s-speech). Among consonants, the greatest influence in f-speech was found on the syllabic nasal /N/ and the least on the voiceless stop /t/. In s-speech, the greatest influence was again found on /N/, while the least was on the voiced stop /d/. The consonant-to-vowel duration ratio of a CV syllable in f-speech remains almost the same as in n-speech, whereas vowel lengthening becomes significantly large in s-speech. As expected, the formant frequencies of vowels differ significantly among the three rates. The five vowels tend to move closer together on the F1-F2 plane as the speaking rate increases, reflecting vowel neutralization. However, the average difference in the third formant was found to be very small.
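The rate figures quoted above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original study: it converts a mean mora duration into morae per minute and derives the nominal targets implied by the instructions "twice as fast" and "half as fast". The function name and the assumption that the targets are exact multiples of the 150 ms/mora reference are illustrative only; the abstract does not report the rates the speakers actually achieved.

```python
def morae_per_minute(ms_per_mora: float) -> float:
    """Convert a mean mora duration in milliseconds to a rate in morae per minute."""
    return 60_000.0 / ms_per_mora


# Reference value stated in the abstract: n-speech averages 150 ms/mora.
NORMAL_MS_PER_MORA = 150.0

# Nominal targets (assumption): exact doubling and halving of the reference rate.
# These are the instructed rates, not measured results.
nominal_ms_per_mora = {
    "f-speech": NORMAL_MS_PER_MORA / 2,  # twice as fast
    "n-speech": NORMAL_MS_PER_MORA,      # reference
    "s-speech": NORMAL_MS_PER_MORA * 2,  # half as fast
}

for label, ms in nominal_ms_per_mora.items():
    print(f"{label}: {ms:.0f} ms/mora = {morae_per_minute(ms):.0f} morae/minute")
```

Under these assumptions the n-speech figure reproduces the stated equivalence of 150 ms/mora and 400 morae/minute.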