ISCA Archive Blizzard 2010

The NTUT Blizzard Challenge 2010 Entry

Yuan-Fu Liao, Ming-Long Wu, Shao-He Lyu

This paper describes our HMM-based speech synthesis system (HTS) submitted to the Blizzard Challenge 2010. Three Mandarin Chinese voices were built this year for two hub tasks (MH1 and MH2) and one spoke task (MS1); the voice for MS2 is the same as the one for MH1. According to the evaluation results, our system scored about 2 points on average in both the mean opinion score (MOS) and similarity tests for MH1, MH2 and MS1. In addition, in the intelligibility test for MH1, the system achieved pinyin error rates of about 22% without tone (PER) and 24% with tone (PTER), and a character error rate (CER) of 28%. However, for the speech-in-noise task, MH2, the performance of our system is not satisfactory, especially in the low signal-to-noise ratio (SNR) case. In conclusion, these results indicate that there is still much room for improvement, especially in dealing with a different speaking style (compared with last year's data) and with noise interference.
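For readers unfamiliar with the intelligibility metrics, the following is a minimal sketch of how such error rates are commonly computed: the edit distance between a reference transcript and a listener's transcript, divided by the reference length. It assumes this standard definition rather than the exact scoring procedure used by the challenge organizers; the function names, the tone-stripping rule (a trailing digit on each pinyin syllable) and the sample syllables are hypothetical and serve only as illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalized by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def strip_tone(syllable):
    """Drop a trailing tone digit, e.g. 'hao3' -> 'hao' (assumed notation)."""
    return syllable.rstrip("0123456789")


# Hypothetical example: the listener hears the wrong tone on one syllable.
reference = ["ni3", "hao3", "ma5"]
heard = ["ni3", "hao4", "ma5"]

pter = error_rate(reference, heard)  # tone counted
per = error_rate([strip_tone(s) for s in reference],
                 [strip_tone(s) for s in heard])  # tone ignored
print(f"PTER = {pter:.2f}, PER = {per:.2f}")
```

Under this definition the tonal error rate can never be lower than the toneless one for the same pair of transcripts, which is consistent with the PTER of 24% versus the PER of 22% reported above. A character error rate would follow the same pattern with sequences of Chinese characters in place of pinyin syllables.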