ISCA Archive Eurospeech 2003

Acoustic model selection and voice quality assessment for HMM-based Mandarin speech synthesis

Wentao Gu, Keikichi Hirose

This paper presents a preliminary study on implementing an HMM-based Mandarin speech synthesis system, whose main advantage lies in its ability to generate a variety of voices. Several acoustic unit representations for Mandarin are compared in order to select an optimal acoustic model set: syllabic vs. sub-syllabic, context-independent vs. context-dependent, toneless vs. tonal, and initial-final vs. preme-toneme models, as well as models with different numbers of states. To take full advantage of HMM-based speech synthesis, several factors affecting speaker adaptation quality, in particular the amount of adaptation data, are also studied.