ISCA Archive SSW 1990

Speech synthesis by optimum concatenation of phoneme segments

Tetsuya Nomura, Hideyuki Mizuno, Hirokazu Sato

To achieve a concatenation-type Japanese text-to-speech system, we propose two basic procedures. The first is the use of phoneme segments with multiple tri-phone labels as the fundamental synthesis units. The multiple tri-phone labels effectively increase the variety of the synthesis units. The second is a segment concatenation procedure that takes account of feature parameter continuity at the segment junctions. A distortion measure at segment junctions is introduced, which indicates how well synthesis units are combined. The proposed procedures produce natural and distinct speech.
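
The abstract does not give the form of the junction distortion or the search used to minimize it, so the following is only an illustrative sketch under stated assumptions. It assumes each candidate segment carries one feature vector at its left edge and one at its right edge, takes the squared Euclidean distance between adjoining edges as the distortion, and picks one candidate per phoneme position by dynamic programming. The names `Segment`, `junction_distortion`, and `select_segments` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical representation: (entry_features, exit_features), i.e. the
# feature vectors at the left and right boundaries of a stored segment.
Segment = Tuple[Vector, Vector]


def junction_distortion(left: Segment, right: Segment) -> float:
    """Assumed measure: squared Euclidean distance between the exit
    features of `left` and the entry features of `right`."""
    return sum((a - b) ** 2 for a, b in zip(left[1], right[0]))


def select_segments(candidates: List[List[Segment]]) -> Tuple[List[int], float]:
    """Choose one candidate per phoneme position so that the summed
    junction distortion is minimal (Viterbi-style dynamic programming).

    Returns the chosen candidate index at each position and the total cost.
    """
    if not candidates or any(not c for c in candidates):
        raise ValueError("every phoneme position needs at least one candidate")

    cost = [0.0] * len(candidates[0])   # best cost ending in each candidate
    back: List[List[int]] = []          # back-pointers per position

    for pos in range(1, len(candidates)):
        prev = candidates[pos - 1]
        new_cost, pointers = [], []
        for seg in candidates[pos]:
            # Cost of reaching `seg` from each candidate at the previous position.
            totals = [cost[j] + junction_distortion(prev[j], seg)
                      for j in range(len(prev))]
            best_prev = min(range(len(totals)), key=totals.__getitem__)
            new_cost.append(totals[best_prev])
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path backwards from the last position.
    best = min(range(len(cost)), key=cost.__getitem__)
    total = cost[best]
    path = [best]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, total
```

In this sketch, multiple tri-phone labels would simply mean that the same stored segment may appear in the candidate lists of several phoneme contexts, which enlarges each list without adding recordings. That reading is an interpretation of the abstract, not a detail it states.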


Cite as: Nomura, T., Mizuno, H., Sato, H. (1990) Speech synthesis by optimum concatenation of phoneme segments. Proc. First ESCA Workshop on Speech Synthesis (SSW 1), 39-42

@inproceedings{nomura90_ssw,
  author={Tetsuya Nomura and Hideyuki Mizuno and Hirokazu Sato},
  title={{Speech synthesis by optimum concatenation of phoneme segments}},
  year=1990,
  booktitle={Proc. First ESCA Workshop on Speech Synthesis (SSW 1)},
  pages={39--42}
}