ISCA Archive Interspeech 2014

Enabling controllability for continuous expression space

Langzhou Chen, Norbert Braunschweiler

A continuous expression space assumes that each utterance carries its own individual expression, so it can model detailed expression information in speech data. However, because a continuous expression space can contain an infinite number of different expressions, labelling them manually is very difficult. As a result, these expressions are hard to identify and extract for synthesising expressive speech, and a mechanism to control the continuous expression space is missing. In a discrete expression space, by contrast, only a few emotions are defined: users can easily choose among them, but the range of expressivity is limited. This work proposes a method, based on cluster adaptive training (CAT), to automatically annotate expressions in the continuous expression space. With the proposed method, complex emotion information can be associated with the individual expressions in the continuous space. These emotion labels can then serve as indexes of the expressions, allowing users to select desired expressions at synthesis time, i.e. enabling controllability for the continuous expression space. At the same time, the rich expressive information in the continuous space is retained, so that more expressive speech can be generated than with the discrete space.
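The idea of using emotion labels as indexes into the continuous space can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify how labels are represented or matched, so the label dictionaries, the cosine-similarity matching, the example data and the function names below are all assumptions made for illustration. The only part taken from the general CAT formulation is that a mean vector is a weighted interpolation of cluster means, with the weight vector acting as a point in the expression space.

```python
from math import sqrt

# Hypothetical index: each entry pairs a point in the continuous expression
# space (a vector of CAT interpolation weights) with automatically assigned
# emotion labels and intensities. All values are invented for illustration.
EXPRESSIONS = [
    {"weights": [1.0, 0.7, 0.1], "labels": {"happy": 0.9, "excited": 0.4}},
    {"weights": [1.0, -0.3, 0.6], "labels": {"sad": 0.8, "tired": 0.5}},
    {"weights": [1.0, 0.2, -0.5], "labels": {"angry": 0.7, "sad": 0.2}},
]


def label_similarity(a, b):
    """Cosine similarity between two sparse emotion-label dictionaries."""
    dot = sum(value * b.get(name, 0.0) for name, value in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_expression(expressions, query):
    """Return the CAT weight vector whose labels best match the query."""
    best = max(expressions, key=lambda e: label_similarity(e["labels"], query))
    return best["weights"]


def interpolate_mean(weights, cluster_means):
    """CAT-style mean: weighted sum of the per-cluster mean vectors."""
    dim = len(cluster_means[0])
    return [
        sum(w * mean[d] for w, mean in zip(weights, cluster_means))
        for d in range(dim)
    ]


if __name__ == "__main__":
    # A user asks for a mostly sad expression; the index returns a weight
    # vector that could then drive synthesis via interpolate_mean().
    chosen = select_expression(EXPRESSIONS, {"sad": 1.0})
    print(chosen)
```

In this sketch the user never handles the weight vector directly. They describe the desired emotion, and the label index resolves that description to a point in the continuous space, which is the kind of controllability the abstract describes.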

doi: 10.21437/Interspeech.2014-195

Cite as: Chen, L., Braunschweiler, N. (2014) Enabling controllability for continuous expression space. Proc. Interspeech 2014, 2912-2916, doi: 10.21437/Interspeech.2014-195

  author={Langzhou Chen and Norbert Braunschweiler},
  title={{Enabling controllability for continuous expression space}},
  booktitle={Proc. Interspeech 2014},