ISCA Archive Blizzard 2010

The GlottHMM Speech Synthesis Entry for Blizzard Challenge 2010

Antti Suni, Tuomo Raitio, Martti Vainio, Paavo Alku

This paper describes the GlottHMM speech synthesis entry for Blizzard Challenge 2010. GlottHMM is a hidden Markov model (HMM) based speech synthesis system that uses glottal inverse filtering to separate the vocal tract from the glottal source. The source and filter characteristics are modeled separately within the HMM framework. In the synthesis stage, natural glottal flow pulses are used to generate the excitation signal, which is then modified according to the desired voice source characteristics generated by the HMM. To counter over-smoothing of the vocal tract filter parameters, a new formant enhancement method is used to sharpen the vocal tract resonances. Finally, speech is synthesized by filtering the glottal excitation with the vocal tract filter.
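
The final source-filter step described above can be illustrated with a minimal sketch. It is not the GlottHMM implementation: the pulse here is a toy Hann-shaped hump rather than a natural glottal flow pulse, the vocal tract is reduced to a single two-pole resonance, and no HMM parameter generation or formant enhancement is included. All function names, the sampling rate, and the formant values are assumptions chosen for illustration.

```python
import math


def pulse_train(pulse, f0_hz, fs_hz, n_samples):
    """Place copies of one pulse at every pitch period (overlap-add)."""
    period = int(round(fs_hz / f0_hz))
    out = [0.0] * n_samples
    for start in range(0, n_samples, period):
        for k, v in enumerate(pulse):
            if start + k < n_samples:
                out[start + k] += v
    return out


def resonator_coeffs(freq_hz, bandwidth_hz, fs_hz):
    """Denominator [1, a1, a2] of a two-pole resonator (one formant)."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)  # pole radius, < 1
    theta = 2.0 * math.pi * freq_hz / fs_hz        # pole angle
    return [1.0, -2.0 * r * math.cos(theta), r * r]


def all_pole_filter(a, x):
    """Compute y[n] = x[n] - sum_{k>=1} a[k] * y[n-k], assuming a[0] == 1."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y


fs = 16000  # assumed sampling rate in Hz
# Toy stand-in for a glottal pulse: one Hann-shaped hump of 40 samples.
pulse = [0.5 * (1.0 - math.cos(2.0 * math.pi * i / 39)) for i in range(40)]
# Excitation: 0.1 s of pulses at an assumed F0 of 100 Hz.
excitation = pulse_train(pulse, 100.0, fs, 1600)
# Assumed single formant at 500 Hz with 80 Hz bandwidth.
a = resonator_coeffs(500.0, 80.0, fs)
speech = all_pole_filter(a, excitation)
```

In a system of the kind described, the filter coefficients and the excitation would vary from frame to frame according to the generated parameters; the sketch keeps both fixed for brevity. With SciPy available, `scipy.signal.lfilter([1.0], a, excitation)` computes the same all-pole filtering.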