ISCA Archive Interspeech 2014

Towards the adaptation of prosodic models for expressive text-to-speech synthesis

Mathieu Avanzi, George Christodoulides, Damien Lolive, Elisabeth Delais-Roussarie, Nelly Barbot

This paper presents a preliminary study that aims to characterize four distinct speaking styles according to a limited set of prosodic features, including the length of prosodic phrases (AP and IP), the distribution of stressed syllables, pitch register span, and the duration of silent pauses. The analysis was performed using semi-automatic procedures on a corpus consisting of 30 minutes of speech per style. The study focuses on four styles, all of which are “overtly addressed to a given audience”, but which differ in the nature of the audience (adults vs. children) and in the desired impact of the address (“importance of being understood and convincing, or not”). Data analysis reveals that (a) dictation (addressed to children) and political speeches (addressed to adults) differ from the two other speaking styles (reading of novels and fairy tales) with respect to a specific set of prosodic cues, while (b) the speeches addressed to children differ from those addressed to adults with respect to another set of prosodic cues (especially pitch register span). These results have a practical application: refining the design of pre-processing prosodic modules in a text-to-speech system in order to improve the expressivity of synthesized speech.

doi: 10.21437/Interspeech.2014-409

Cite as: Avanzi, M., Christodoulides, G., Lolive, D., Delais-Roussarie, E., Barbot, N. (2014) Towards the adaptation of prosodic models for expressive text-to-speech synthesis. Proc. Interspeech 2014, 1796-1800, doi: 10.21437/Interspeech.2014-409

  author={Mathieu Avanzi and George Christodoulides and Damien Lolive and Elisabeth Delais-Roussarie and Nelly Barbot},
  title={{Towards the adaptation of prosodic models for expressive text-to-speech synthesis}},
  booktitle={Proc. Interspeech 2014},