ISCA Archive SpeechProsody 2004

Automatic analysis and synthesis of Fujisaki's intonation model for TTS

Pablo Daniel Agüero, Klaus Wimmer, Antonio Bonafonte

This paper deals with the automatic analysis and synthesis of intonation using Fujisaki's model. We propose an analysis method that imposes strong linguistic constraints. This method produces good representations of the F0 contour when compared with other current methods that do not impose such constraints. Furthermore, the constrained approach limits variability and is more predictable, which makes it the best option for prediction (at least when accent commands are related to accent groups). Several prediction algorithms are evaluated. The results show that VCART (an extension of CART that predicts vector values) gives the best performance when compared with standard CART and with neural networks. The paper also analyzes which features are most relevant for predicting the parameters of Fujisaki's model.
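
For readers unfamiliar with the model, the synthesis side can be illustrated with a minimal sketch of the standard Fujisaki formulation, in which log F0 is the sum of a base frequency, phrase components, and accent components. This is not the authors' implementation: the function names, the command tuple layout, and the default constants (alpha = 2.0, beta = 20.0, gamma = 0.9) are illustrative assumptions, not values taken from the paper.

```python
import math


def phrase_response(t, alpha=2.0):
    """Impulse response of the phrase control mechanism, Gp(t).

    alpha is an assumed illustrative constant, not a value from the paper.
    """
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent control mechanism, Ga(t), capped at gamma.

    beta and gamma are assumed illustrative constants.
    """
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrase_cmds, accent_cmds,
                alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb          -- base frequency in Hz
    phrase_cmds -- list of (onset_time, amplitude) tuples (hypothetical layout)
    accent_cmds -- list of (onset_time, offset_time, amplitude) tuples
    """
    log_f0 = math.log(fb)
    # Each phrase command contributes a scaled impulse response.
    for t0, ap in phrase_cmds:
        log_f0 += ap * phrase_response(t - t0, alpha)
    # Each accent command contributes the difference of two step responses.
    for t1, t2, aa in accent_cmds:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)
```

Under these assumptions, a contour is obtained by evaluating `fujisaki_f0` on a grid of time points, for example `[fujisaki_f0(i / 100, 100.0, [(0.0, 0.5)], [(0.2, 0.5, 0.4)]) for i in range(200)]`. The analysis problem addressed in the paper is the inverse one: recovering the command timings and amplitudes from an observed F0 contour.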