ISCA Archive SpeechProsody 2002

Generation of emotions by a morphing technique in English, French and Spanish

Philippe Boula de Mareüil, Philippe Célérier, Jacques Toen

Generating variants is becoming a priority for text-to-speech (TTS) synthesis. In particular, additional mark-up inserted in the text may be used to communicate emotions. Within the framework of a European project linked to the MPEG-4 standard (INTERFACE), our purpose is the synthesis of six emotions (anger, disgust, fear, joy, surprise and sadness). This was done by applying a morphing technique to the sequence of phonemes and their corresponding prosodic characteristics, as generated for a "neutral" style by a multilingual TTS system. We have corpora in English, French and Spanish in which professional actors render each of these six emotions. Some trends can be observed in them, such as the inversion of fundamental frequency slopes for disgust and the pruning of melodic movements for sadness. We expect that, within the framework of MPEG-4, the perceptual identification of the different emotions will be made easier by the addition of a visual component: a talking head.
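As an illustration only, the two trends mentioned above (inversion of fundamental frequency slopes for disgust, pruning of melodic movements for sadness) might be sketched as transformations of a neutral phoneme sequence. The `Phoneme` structure, the function names, the mirroring around the utterance mean and the `keep` factor are all assumptions made for this sketch; they are not the system's actual rules or data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phoneme:
    """Hypothetical neutral-style TTS output unit (not the system's real format)."""
    symbol: str
    duration_ms: float
    f0_hz: List[float]  # F0 targets inside the phoneme; empty if unvoiced


def _mean_f0(phonemes: List[Phoneme]) -> float:
    """Mean of all F0 targets in the utterance (0.0 if none are voiced)."""
    values = [f for p in phonemes for f in p.f0_hz]
    return sum(values) / len(values) if values else 0.0


def invert_f0_slopes(phonemes: List[Phoneme]) -> List[Phoneme]:
    """Assumed rendering of the 'disgust' trend: mirror each F0 target
    around the utterance mean, so that rises become falls and vice versa."""
    mean = _mean_f0(phonemes)
    return [
        Phoneme(p.symbol, p.duration_ms, [2 * mean - f for f in p.f0_hz])
        for p in phonemes
    ]


def prune_melodic_movements(phonemes: List[Phoneme], keep: float = 0.3) -> List[Phoneme]:
    """Assumed rendering of the 'sadness' trend: shrink each F0 excursion
    from the utterance mean to a fraction `keep` of its original size."""
    mean = _mean_f0(phonemes)
    return [
        Phoneme(p.symbol, p.duration_ms, [mean + keep * (f - mean) for f in p.f0_hz])
        for p in phonemes
    ]
```

Both functions leave the phoneme symbols and durations untouched and return new lists, so the neutral sequence can be reused for each target emotion. A fuller morphing would presumably also alter durations and intensity, which this sketch does not attempt.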