ISCA Archive SpeechProsody 2008

A new clustering approach for JEMA

Pablo Daniel Agüero, Juan Carlos Tulli, Antonio Bonafonte

This paper focuses on the training process of intonation models for text-to-speech (TTS) synthesis. In previous papers we concentrated on two key points of intonation modelling: interpolation of the fundamental frequency contour in unvoiced segments and sentence-by-sentence parameter extraction. We proposed an alternative approach to model training, named JEMA (Joint Extraction and Modeling Approach), using CART. Here we propose a new way to obtain the mapping function that relates the linguistic features available in TTS to the space of fundamental frequency contours. A clustering algorithm that uses a distance measure over a feature vector space of variable dimension partitions the fundamental frequency contours in the training data. In this way we seek important groups of features whose specific values explain the shape of the fundamental frequency contours. In the experimental results, the proposed technique shows improvements over CART.

doi: 10.21437/SpeechProsody.2008-19

Cite as: Agüero, P.D., Tulli, J.C., Bonafonte, A. (2008) A new clustering approach for JEMA. Proc. Speech Prosody 2008, 83-86, doi: 10.21437/SpeechProsody.2008-19

  author={Pablo Daniel Agüero and Juan Carlos Tulli and Antonio Bonafonte},
  title={{A new clustering approach for JEMA}},
  booktitle={Proc. Speech Prosody 2008},