ISCA Archive SpeechProsody 2004

Development of the F0 control model for singing-voices synthesis

Takeshi Saitou, Masashi Unoki, Masato Akagi

Constructing a singing-voice synthesis system that can generate natural singing voices requires a fundamental frequency (F0) control model for singing voices. This paper describes the development of such a model. Psychoacoustic experiments investigating how strongly F0 fluctuations influence singing-voice perception identify these fluctuations as characteristics that must be controlled in the F0 contour of a singing voice. They have a wider dynamic range and change in more complicated ways than the F0 fluctuations of speaking voices. The F0 control model is designed to control the F0 fluctuations that are important for singing-voice perception, and a singing-voice synthesis method using the model is proposed to synthesize natural singing voices. The experimental results show that F0 fluctuations are significant factors in singing-voice perception, that the F0 control model can generate F0 contours of singing voices, and that it can be applied to synthesize natural singing voices.