ISCA Archive Blizzard 2012

The GlottHMM Entry for Blizzard Challenge 2012: Hybrid Approach

Antti Suni, Tuomo Raitio, Martti Vainio, Paavo Alku

This paper describes the GlottHMM speech synthesis system for Blizzard Challenge 2012. The aim of the GlottHMM system is to combine high-quality vocoding with detailed prosody modeling in order to produce expressive, high-quality synthetic speech. GlottHMM is based on statistical parametric speech synthesis, but it uses a glottal flow pulse library to generate the excitation signal. It can thus be regarded as a hybrid system in which the pulses act as concatenative units, selected according to the statistically generated voice source feature trajectories. This year's speech material was especially challenging; nevertheless, we were able to achieve a clean, intelligible voice with decent, above-average prosodic characteristics.
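
The hybrid idea of choosing library pulses to follow generated voice source features can be illustrated with a minimal sketch. It is not the GlottHMM implementation: the function name `select_pulses`, the two-dimensional feature vectors, and the plain nearest-neighbour Euclidean criterion are all assumptions made for illustration, and a real system would likely also weigh how well consecutive pulses join.

```python
import math
from typing import List, Sequence


def select_pulses(
    targets: Sequence[Sequence[float]],
    library: Sequence[Sequence[float]],
) -> List[int]:
    """Return, for each target frame, the index of the closest library pulse.

    Hypothetical helper: 'closest' means smallest Euclidean distance between
    the generated feature vector and the feature vector stored for a pulse.
    """
    if not library:
        raise ValueError("pulse library is empty")
    chosen = []
    for frame in targets:
        # Distance from this frame's generated features to every library pulse.
        distances = [math.dist(frame, pulse) for pulse in library]
        chosen.append(distances.index(min(distances)))
    return chosen


# Toy library: three pulses, each described by two made-up source features.
pulse_library = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]
# Toy trajectory: four frames standing in for statistically generated features.
trajectory = [(0.1, 0.0), (0.9, 1.2), (1.8, 0.4), (0.4, 0.3)]

selected = select_pulses(trajectory, pulse_library)
print(selected)  # → [0, 1, 2, 0]
```

Each index in `selected` names the library pulse that would be used for the corresponding frame of the excitation signal under this simplified criterion.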