ISCA Archive Eurospeech 1997

A non-iterative model-adaptive E-CMN/PMC approach for speech recognition in car environments

Makoto Shozakai, Satoshi Nakamura, Kiyohiro Shikano

This paper investigates Cepstrum Mean Normalization (CMN), which is widely acknowledged as useful for compensating multiplicative distortions. However, the performance of conventional CMN is limited because normalization by a single cepstrum mean vector is not enough to compensate for the many factors of multiplicative distortion in real environments. To solve this problem, a new method, E-CMN, is proposed. The method estimates two cepstrum mean vectors for each speaker, one for speech and the other for non-speech, and subtracts them from the input cepstrum. This method is capable of compensating various kinds of multiplicative distortion collectively to normalize input spectra. Furthermore, a new model-adaptive approach, E-CMN/PMC, based on E-CMN and HMM composition, is proposed for environments with additive noise and multiplicative distortions. This method is simplified in the sense that speech models and an additive noise model can be added without any iterative operations. Matching gains for all frequency bands of speech models to the noise model are uniquely estimated as a cepstrum mean vector for speech. Finally, the performance of E-CMN/PMC in adverse car environments is evaluated.
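
The two-mean normalization described in the abstract might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `e_cmn`, the array layout (frames by cepstral coefficients), and the boolean speech/non-speech mask are all hypothetical, and the abstract does not say how frames are classified or how the means are accumulated per speaker. The sketch assumes the speech mean is subtracted from speech frames and the non-speech mean from non-speech frames.

```python
import numpy as np


def e_cmn(cepstra, is_speech):
    """Normalize cepstra with separate speech and non-speech mean vectors.

    cepstra   : array of shape (num_frames, num_coeffs), one speaker's data
    is_speech : boolean array of shape (num_frames,), True for speech frames
                (assumed to come from some external speech detector)
    Returns (normalized, speech_mean, nonspeech_mean).
    """
    cepstra = np.asarray(cepstra, dtype=float)
    is_speech = np.asarray(is_speech, dtype=bool)
    if cepstra.ndim != 2 or is_speech.shape != (cepstra.shape[0],):
        raise ValueError("expected cepstra (T, D) and is_speech (T,)")

    dim = cepstra.shape[1]
    # Fall back to a zero mean when a class has no frames (an assumption).
    speech_mean = (
        cepstra[is_speech].mean(axis=0) if is_speech.any() else np.zeros(dim)
    )
    nonspeech_mean = (
        cepstra[~is_speech].mean(axis=0) if (~is_speech).any() else np.zeros(dim)
    )

    # Subtract the mean that matches each frame's class.
    normalized = np.where(
        is_speech[:, None], cepstra - speech_mean, cepstra - nonspeech_mean
    )
    return normalized, speech_mean, nonspeech_mean
```

By contrast, conventional CMN would subtract a single mean computed over all frames, which is the limitation the abstract points to.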