ISCA Archive Eurospeech 1993

High-quality synthesis of LPC speech using multiband excitation model

C. F. Chan

This paper describes a method for achieving high-quality synthesis of speech from data coded in LPC-10 format. The method uses the Multi-Band Excitation (MBE) model for speech generation. To drive the MBE synthesizer, a set of voiced/unvoiced (V/UV) decisions for the pitch harmonics has to be regenerated from the coded LPC data. We introduce a training and classification technique that extracts the correlation between the spectrum envelope and the excitation. The speech spectrum envelope and the excitation were found to be highly correlated, so a footprint of the V/UV mixture function can be extracted during the training stage and stored alongside the spectrum envelope information. This V/UV information can later be used to estimate a set of V/UV decisions for MBE synthesis. Unlike conventional LPC speech, which sounds buzzy because a single V/UV switch is applied to the whole spectrum, the speech generated by the new method was demonstrated to sound very natural and of high quality.
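As a rough illustration of the training and classification idea, the following Python sketch stores an averaged V/UV footprint next to each spectrum envelope class and looks it up at synthesis time. It is not the paper's algorithm: the abstract does not specify the classifier, so the nearest-centroid rule, the Euclidean distance, the 0.5 threshold, the precomputed class labels, and the names `train_footprints` and `estimate_vuv` are all assumptions made for illustration, and the data are toy values.

```python
from math import dist  # Euclidean distance, Python 3.8+


def train_footprints(envelopes, vuv_decisions, labels):
    """Average the envelope and the per-band V/UV decisions within each class.

    Returns {label: (centroid, footprint)}, where footprint[k] is the
    fraction of training frames in that class whose band k was voiced.
    The class labels are assumed to be given (hypothetical input).
    """
    groups = {}
    for env, vuv, label in zip(envelopes, vuv_decisions, labels):
        groups.setdefault(label, []).append((env, vuv))

    table = {}
    for label, frames in groups.items():
        n = len(frames)
        centroid = [sum(col) / n for col in zip(*(env for env, _ in frames))]
        footprint = [sum(col) / n for col in zip(*(vuv for _, vuv in frames))]
        table[label] = (centroid, footprint)
    return table


def estimate_vuv(envelope, table, threshold=0.5):
    """Pick the class with the nearest centroid and threshold its footprint."""
    _, footprint = min(table.values(), key=lambda entry: dist(envelope, entry[0]))
    return [1 if p >= threshold else 0 for p in footprint]


# Toy training set: 2-dimensional envelope vectors, 3 harmonic bands.
envelopes = [[1.0, 0.2], [0.8, 0.4], [0.1, 0.9], [0.3, 0.7]]
vuv_decisions = [[1, 1, 0], [1, 1, 1], [0, 0, 0], [1, 0, 0]]
labels = [0, 0, 1, 1]

table = train_footprints(envelopes, vuv_decisions, labels)
mostly_voiced = estimate_vuv([0.95, 0.25], table)
mostly_unvoiced = estimate_vuv([0.15, 0.85], table)
```

In this sketch the decoder needs only the envelope, which the LPC data already carries, to recover a per-band V/UV pattern instead of a single switch for the whole spectrum.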

Keywords: Speech Synthesis, Multiband Excitation