ISCA Archive SpeechProsody 2004

Analysis of segmental duration for Thai speech synthesis

Chatchawarn Hansakunbuntheung, Yoshinori Sagisaka

This paper presents a study of the characteristics of Thai segmental duration and applies the analysis results to construct a Thai phone duration model for Thai speech synthesis. The study uses Hayashi's categorized linear regression model to analyze the effects of various factors, including the current phoneme itself, the surrounding phonemes, phone position in the word, phone position in the phrase, part of speech, and Thai tone. These factors are combined to form a Thai phone duration model. The model gives a rather high correlation of 0.788. Although it has a fairly high RMS error of 33.14 ms, an evaluation shows that the model is highly consistent on unseen data.
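As a rough illustration of the kind of model described above, a categorized linear regression can be read as ordinary least squares on dummy-coded categorical factors, scored by correlation and RMS error. The sketch below is a minimal example under that assumption and is not the authors' implementation. The three factors, their levels, the helper names (one_hot, fit, predict, rmse, pearson) and the durations are all hypothetical toy values generated from a made-up additive rule; none of them come from the paper's data.

```python
import math
from itertools import product

# Hypothetical factors and levels (toy assumption, not the paper's factor set).
# The first level of each factor is the reference level and gets no column.
levels = [["short", "long"], ["mid", "low", "falling"], ["non-final", "final"]]

def one_hot(rows, levels):
    """Dummy-code categorical rows, with an intercept in column 0."""
    X = []
    for row in rows:
        x = [1.0]
        for value, lv in zip(row, levels):
            x.extend(1.0 if value == name else 0.0 for name in lv[1:])
        X.append(x)
    return X

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * w[k] for k in range(r + 1, n))
        w[r] = (M[r][n] - s) / M[r][r]
    return w

def fit(X, y):
    """Least squares through the normal equations (X^T X) w = X^T y."""
    m = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(m)] for i in range(m)]
    Xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(m)]
    return solve(XtX, Xty)

def predict(X, w):
    return [sum(a * b for a, b in zip(x, w)) for x in X]

def rmse(y, p):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))

def pearson(y, p):
    my, mp = sum(y) / len(y), sum(p) / len(p)
    cov = sum((a - my) * (b - mp) for a, b in zip(y, p))
    vy = sum((a - my) ** 2 for a in y)
    vp = sum((b - mp) ** 2 for b in p)
    return cov / math.sqrt(vy * vp)

# Toy data: every combination of levels, with durations in ms produced by a
# made-up additive rule so the fit can be checked by eye.
rows = list(product(*levels))
y = [80.0 + 60.0 * (l == "long") + 10.0 * (t == "low")
     + 20.0 * (t == "falling") + 30.0 * (p == "final") for l, t, p in rows]

X = one_hot(rows, levels)
w = fit(X, y)          # intercept, then one coefficient per non-reference level
pred = predict(X, w)
print("coefficients (ms):", [round(c, 3) for c in w])
print("correlation:", pearson(y, pred), "RMS error (ms):", rmse(y, pred))
```

Each fitted coefficient is the estimated duration offset, in ms, of one factor level relative to its reference level. On real speech data the fit would not be exact, which is why the correlation and RMS error are reported as the abstract does.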