ISCA Archive Eurospeech 1997

Use of pitch pattern improvement in the CHATR speech synthesis system

Ken Fujisawa, Toshio Hirai, Norio Higuchi

A corpus-based concatenative speech synthesis system that uses no signal processing can produce intelligible synthetic speech that preserves the original voice characteristics, but natural prosody can sometimes be difficult to achieve. In such a system, it is very important to select waveform segments whose natural prosody is close to the target prosody. This paper describes several approaches to unit selection for improving the prosody, and especially the intonation, of such synthetic speech. If the unit selection measures for the fundamental frequency (F0) are insufficient, the concatenative system may produce speech with a discontinuous F0 pattern. Our proposed solution to this problem is to add extra measures for selecting units that form a smoother, more continuous F0 contour. Through subjective experiments, we confirmed that each of these measures effectively improved intonation naturalness.
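
The abstract does not specify the extra F0 measures, so the following is only a minimal sketch of the general idea: a dynamic-programming unit selection in which a join cost penalizes F0 jumps at concatenation points. The `Unit` fields, the cost functions, and the `w_join` weight are all hypothetical names chosen for illustration, not the measures used in CHATR or in this paper.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """Hypothetical candidate waveform segment with F0 at its two edges (Hz)."""
    name: str
    f0_start: float
    f0_end: float


def target_cost(unit, target_f0):
    # Assumed measure: distance between the unit's mean F0 and the target F0.
    return abs((unit.f0_start + unit.f0_end) / 2 - target_f0)


def join_cost(prev, nxt):
    # Assumed measure: size of the F0 jump at the concatenation point.
    return abs(prev.f0_end - nxt.f0_start)


def select_units(candidates, targets, w_join=1.0):
    """Pick one unit per position, minimizing target cost plus weighted join cost.

    candidates: list (one entry per position) of lists of Unit
    targets:    list of target F0 values, one per position
    """
    # best[i] = (accumulated cost, path) for paths ending in candidate i
    best = [(target_cost(u, targets[0]), [u]) for u in candidates[0]]
    for cands, tgt in zip(candidates[1:], targets[1:]):
        new_best = []
        for u in cands:
            cost, path = min(
                ((c + w_join * join_cost(p[-1], u), p) for c, p in best),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost(u, tgt), path + [u]))
        best = new_best
    return min(best, key=lambda item: item[0])[1]


# Toy inventory: D matches the target mean exactly but starts with a large F0 jump.
candidates = [
    [Unit("A", 100.0, 110.0), Unit("B", 118.0, 122.0)],
    [Unit("C", 112.0, 118.0), Unit("D", 150.0, 90.0)],
]
targets = [120.0, 120.0]

without_join = [u.name for u in select_units(candidates, targets, w_join=0.0)]
with_join = [u.name for u in select_units(candidates, targets, w_join=1.0)]
```

In this toy inventory, selection on the target measure alone (`w_join=0.0`) picks B then D, which leaves a 28 Hz jump at the join. With the continuity term enabled, the search picks B then C instead, trading a slightly worse target match for a smoother contour.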


doi: 10.21437/Eurospeech.1997-674

Cite as: Fujisawa, K., Hirai, T., Higuchi, N. (1997) Use of pitch pattern improvement in the CHATR speech synthesis system. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 2671-2674, doi: 10.21437/Eurospeech.1997-674

@inproceedings{fujisawa97_eurospeech,
  author={Ken Fujisawa and Toshio Hirai and Norio Higuchi},
  title={{Use of pitch pattern improvement in the CHATR speech synthesis system}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={2671--2674},
  doi={10.21437/Eurospeech.1997-674},
  issn={1018-4074}
}