This paper evaluates several channel compensation techniques in a specific scenario of acoustic mismatch between training and test conditions, for a speaker-dependent speech recognizer in both the DDHMM and CDHMM frameworks. Several methods, such as High Pass Filtering [7], Cepstral Mean Normalization, RATZ algorithms [2], and Bayesian learning (Maximum a Posteriori estimation) [3, 5, 4] without stereo data, were taken into account in order to analyze the problem of robustness to microphone variations in blind conditions. The evaluation was carried out in terms of performance improvement, amount of adaptation data, and computational load.
The experiments were run on a speaker-dependent continuous speech recognition task with a test vocabulary of 247 words. In the CDHMM framework, using the best algorithm for each situation, the relative error reduction in adapted cross conditions ranges from 53.4% with about 3 seconds of adaptation data to 71.8% with 5 minutes.
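
As an illustration of the simplest of the blind techniques named above, the following is a minimal sketch of Cepstral Mean Normalization. It is not the implementation evaluated in this paper: the function name `cepstral_mean_normalize`, the list-of-frames representation, and the toy feature values are assumptions made for the example. It relies only on the standard property that a stationary linear channel appears as an additive constant in the cepstral domain, so subtracting the per-utterance mean of each coefficient cancels it.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean computed over one utterance.

    `frames` is a list of cepstral vectors (lists of floats), one per frame.
    This representation is an assumption of the sketch, not the paper's.
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    # Mean of each cepstral coefficient across all frames of the utterance.
    mean = [sum(frame[d] for frame in frames) / n_frames for d in range(dim)]
    # Remove that mean from every frame.
    return [[frame[d] - mean[d] for d in range(dim)] for frame in frames]


# Toy data (hypothetical values): the same utterance seen through a second
# microphone, modelled as a constant offset added to every cepstral vector.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
channel_offset = [0.5, -1.5]
mismatched = [[c + o for c, o in zip(frame, channel_offset)] for frame in clean]

normalized_clean = cepstral_mean_normalize(clean)
normalized_mismatched = cepstral_mean_normalize(mismatched)
```

In this toy case `normalized_clean` and `normalized_mismatched` coincide, because the constant offset is absorbed into the mean. Real microphone mismatch is only approximately a constant cepstral offset, and the mean estimate depends on how much speech is available.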