ISCA Archive Interspeech 2012
ISCA Archive Interspeech 2012

Perceptual foundations for naturalistic variability in the prosody of synthetic speech

Nanette Veilleux, Jonathan Barnes, Alejna Brugos, Stefanie Shattuck-Hufnagel

Recent studies have shown that the Tonal Center of Gravity is a better classifier than F0 Turning Points for at least two contrastively timed pitch accents in American English intonation contours. Within this framework, a binary F0 weighting function derived from the F0 contour can be used instead of the natural F0 contour without a degradation in discrimination performance. This success has important implications for speech synthesis. Just as we can capture the functional equivalence of a multitude of auditorily distinct F0 contour shapes in terms of their mapping to a single parameter (the TCoG) via a set of binary weighting functions, this same mapping could be run in reverse as a source to generate natural-sounding variability in speech synthesis.

Index Terms: Tonal Center of Gravity, F0 alignment, pitch accent classification, prosody, speech synthesis