ISCA Archive Eurospeech 1997

On the global F0 shape model using a transition network for Japanese text-to-speech systems

Yasushi Ishikawa, Takashi Ebihara

In this paper, we describe a model of fundamental frequency (F0) control. A two-stage model consisting of a global model and a local model is generally used as the F0 control method in Japanese text-to-speech systems. As the global model, we propose a transition network that generates the parameters of a local pitch model from the linguistic parameters of a sentence. In the proposed model, syntactic analysis and the generation of F0 parameters are integrated, and the nodes of the network represent the linguistic and prosodic state of a sentence. The parameters of the local model are generated when a transition is taken. We also propose a training method for the network. The prediction results show that our model can predict the phrasal accent parameters with satisfactorily high accuracy. We also describe how the model can be applied to the prediction of pause positions.


doi: 10.21437/Eurospeech.1997-676

Cite as: Ishikawa, Y., Ebihara, T. (1997) On the global F0 shape model using a transition network for Japanese text-to-speech systems. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 2679-2682, doi: 10.21437/Eurospeech.1997-676

@inproceedings{ishikawa97_eurospeech,
  author={Yasushi Ishikawa and Takashi Ebihara},
  title={{On the global F0 shape model using a transition network for Japanese text-to-speech systems}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={2679--2682},
  doi={10.21437/Eurospeech.1997-676},
  issn={1018-4074}
}