ISCA Archive Eurospeech 1993

Designing control rules for a serial pole-zero vocal tract model

J. Kerkhoff, Lou Boves

Our rule-based multilingual text-to-speech system uses a synthesizer based on the source-filter theory of speech production. The voice source and noise source are implemented in a conventional manner: the voice source is a mathematical function and the noise source is a random generator producing noise with a Gaussian amplitude distribution. The vocal tract filter is not conventional: we employ a pole-zero (ARMA) filter, implemented as a cascade of second-order resonators and antiresonators. Developing effective rules to control the ARMA vocal tract proved more difficult than anticipated, mainly because the physical interpretation of the zeros may change abruptly. From a system control point of view, it became clear that zeros and poles cannot be allowed to move independently. Moreover, the behaviour of a time-varying ARMA system turned out to be sensitive to internal group delays that are immaterial in stationary systems. The paper explains in detail the mathematics of controlling pole and zero parameters simultaneously in an ARMA filter, and then describes the solution implemented in our rule-based text-to-speech system.
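As an illustration of what a cascade of second-order resonators and antiresonators can look like, the following is a minimal sketch for the stationary case only. The abstract does not give the authors' filter equations, so the coefficient formulas here are an assumption: they follow the digital resonator commonly used in cascade formant synthesizers, with unity gain at 0 Hz, and the antiresonator is taken as its exact inverse. The function names and the centre frequency, bandwidth and sampling rate values are hypothetical.

```python
import math


def resonator_coeffs(freq_hz, bw_hz, fs_hz):
    """Coefficients (a, b, c) of y[n] = a*x[n] + b*y[n-1] + c*y[n-2].

    Assumed formulation (not taken from the paper): a pole pair at the
    given centre frequency and bandwidth, scaled to unity gain at 0 Hz.
    """
    c = -math.exp(-2.0 * math.pi * bw_hz / fs_hz)
    b = (2.0 * math.exp(-math.pi * bw_hz / fs_hz)
         * math.cos(2.0 * math.pi * freq_hz / fs_hz))
    a = 1.0 - b - c
    return a, b, c


def resonator(x, freq_hz, bw_hz, fs_hz):
    """Second-order all-pole section (one complex pole pair)."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs_hz)
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for s in x:
        y = a * s + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def antiresonator(x, freq_hz, bw_hz, fs_hz):
    """Second-order all-zero section, the inverse of resonator()."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs_hz)
    x1 = x2 = 0.0  # previous two input samples
    out = []
    for s in x:
        out.append((s - b * x1 - c * x2) / a)
        x2, x1 = x1, s
    return out


def cascade(x, poles, zeros, fs_hz):
    """Pass x through all resonators, then all antiresonators, in series.

    poles and zeros are lists of (freq_hz, bw_hz) pairs that stay fixed
    for the whole signal (stationary case).
    """
    y = list(x)
    for freq_hz, bw_hz in poles:
        y = resonator(y, freq_hz, bw_hz, fs_hz)
    for freq_hz, bw_hz in zeros:
        y = antiresonator(y, freq_hz, bw_hz, fs_hz)
    return y


if __name__ == "__main__":
    fs = 10000.0                    # hypothetical sampling rate in Hz
    impulse = [1.0] + [0.0] * 63
    # A zero pair placed exactly on a pole pair cancels it.
    y = cascade(impulse, poles=[(500.0, 60.0)], zeros=[(500.0, 60.0)], fs_hz=fs)
    print(max(abs(u - v) for u, v in zip(y, impulse)))
```

With fixed parameters, a zero pair placed on a pole pair cancels it regardless of the order of the sections. The sketch does not model the time-varying case the abstract is concerned with, in which parameters change from sample to sample and the internal state of each section starts to matter.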

Keywords: text-to-speech, pole-zero synthesizer