ISCA Archive Interspeech 2006

An efficient segment-based speech compression technique for hand-held TTS systems

Chang-Heon Lee, Sung-Kyo Jung, Thomas Eriksson, Won-Suk Jun, Hong-Goo Kang

This paper proposes a novel segment-based speech coding algorithm to efficiently compress the database for concatenative text-to-speech (TTS) systems. To achieve a high compression ratio and meet the fundamental requirements of concatenative TTS synthesizers, i.e., partial segment decoding and random access capability, we adopt a modified analysis-by-synthesis scheme. The spectral coefficients are quantized by a length-based interpolation method, and excitation signals are modeled with both non-predictive and predictive approaches. Because the pitch-pulse waveforms of a specific speaker show low intra-speaker variation, the conventional adaptive codebook for pitch prediction is replaced by a speaker-dependent pitch-pulse codebook. By applying the proposed algorithm to a hand-held Korean TTS system, we verify that the proposed coder provides a compression ratio of about 1/13, a low complexity of around 1.2 WMOPS, and random access capability.
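To illustrate why a fixed pitch-pulse codebook is compatible with random access, the following is a minimal sketch, not the authors' implementation. It assumes a plain least-squares gain-shape search over a small stored codebook; the function names, the toy codebook, and the vector length are all hypothetical, and the synthesis filtering and perceptual weighting of a real analysis-by-synthesis coder are omitted. The point it shows is that decoding needs only an index, a gain, and the stored codebook, with no dependence on previously decoded excitation.

```python
# Hypothetical illustration only: names, sizes, and values are assumptions,
# not taken from the paper.

def search_pulse_codebook(target, codebook):
    """Return (index, gain) of the entry that best matches `target`
    in the least-squares sense, using the optimal gain per entry."""
    best_index, best_gain, best_err = -1, 0.0, float("inf")
    target_energy = sum(t * t for t in target)
    for i, pulse in enumerate(codebook):
        energy = sum(p * p for p in pulse)
        if energy == 0.0:
            continue  # an all-zero entry cannot represent anything
        corr = sum(t * p for t, p in zip(target, pulse))
        gain = corr / energy                       # optimal gain for this entry
        err = target_energy - corr * corr / energy  # residual error at that gain
        if err < best_err:
            best_index, best_gain, best_err = i, gain, err
    return best_index, best_gain


def decode_pulse(index, gain, codebook):
    """Reconstruct the excitation from the stored codebook alone.
    No past excitation is needed, so any segment can be decoded in isolation."""
    return [gain * p for p in codebook[index]]


# Toy speaker-dependent codebook of pitch-pulse shapes (assumed values).
codebook = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.5],
    [1.0, -1.0, 0.0, 0.0],
]
target = [0.0, 1.0, 2.0, 1.0]

index, gain = search_pulse_codebook(target, codebook)
reconstructed = decode_pulse(index, gain, codebook)
```

A conventional adaptive codebook, by contrast, is built from previously decoded excitation, so a decoder would have to process the preceding frames before it could start at an arbitrary segment.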


doi: 10.21437/Interspeech.2006-60

Cite as: Lee, C.-H., Jung, S.-K., Eriksson, T., Jun, W.-S., Kang, H.-G. (2006) An efficient segment-based speech compression technique for hand-held TTS systems. Proc. Interspeech 2006, paper 1980-Mon1FoP.3, doi: 10.21437/Interspeech.2006-60

@inproceedings{lee06_interspeech,
  author={Chang-Heon Lee and Sung-Kyo Jung and Thomas Eriksson and Won-Suk Jun and Hong-Goo Kang},
  title={{An efficient segment-based speech compression technique for hand-held TTS systems}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1980-Mon1FoP.3},
  doi={10.21437/Interspeech.2006-60},
  issn={2958-1796}
}