ISCA Archive SSW 2010

Comparison of formant enhancement methods for HMM-based speech synthesis

Tuomo Raitio, Antti Suni, Hannu Pulakka, Martti Vainio, Paavo Alku

Hidden Markov model (HMM) based speech synthesis tends to over-smooth the spectral envelope of speech, which makes the synthetic speech sound muffled. One way to compensate for the over-smoothing is to enhance the formants of the spectral model. This paper compares the performance of different formant enhancement methods and studies formant enhancement prior to HMM training as a means of preemptively compensating for the over-smoothing. A new method for enhancing the formants of an all-pole model is also introduced. Experiments indicate that formant enhancement prior to HMM training improves the quality of synthetic speech by providing sharper formants, and that the new formant enhancement method performs similarly to the existing method.

Index Terms: speech synthesis, hidden Markov model, oversmoothing, formant enhancement