ISCA Archive Interspeech 2010

Conversational spontaneous speech synthesis using average voice model

Tomoki Koriyama, Takashi Nose, Takao Kobayashi

This paper describes conversational spontaneous speech synthesis based on hidden Markov models (HMMs). To reduce the amount of data required for model training, we use the average-voice-based speech synthesis framework, which has been shown to be effective for synthesizing speech in an arbitrary speaker's voice from a small amount of training data. We examine several kinds of average voice models trained on reading-style speech, conversational speech, or both. We also examine which utterance unit is appropriate for conversational speech synthesis. Experimental results show that the proposed two-stage model adaptation method improves the quality of synthetic conversational speech.