ISCA Archive Blizzard 2010

Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2010

Keiichiro Oura, Kei Hashimoto, Sayaka Shiota, Keiichi Tokuda

This paper describes a hidden Markov model (HMM)-based speech synthesis system developed for the Blizzard Challenge 2010. The system employs STRAIGHT vocoding, minimum generation error (MGE) training, minimum generation error linear regression (MGELR)-based model adaptation, the Bayesian speech synthesis framework, and a parameter generation algorithm that considers global variance. The real-time factor of the speech synthesis system is about 0.3, and its footprint is less than 25 MB. Subjective evaluation results show that the overall speech quality and intelligibility of the system are better than those of most other systems, especially when a well-labeled speech database can be used.