ISCA Archive Interspeech 2011

A multichannel feature-based processing for robust speech recognition

Mehrez Souden, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani

We propose a new approach to multichannel robust speech recognition. The approach extends vector Taylor series (VTS)-based feature compensation from the single-channel to the multichannel case. Specifically, we use the first-order VTS to approximate the feature vector of each microphone. These features are then processed jointly to estimate the acoustic channel and noise statistics via expectation-maximization (EM). Experimental results with TI-Digits and measured impulse responses show that the proposed method achieves significant gains in word recognition accuracy under different noise conditions.
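The first-order VTS step can be illustrated with a minimal sketch. It assumes the log-spectral mismatch function commonly used in VTS compensation, y = x + h + log(1 + exp(n - x - h)), where x, h, and n are the clean speech, channel, and noise features; the abstract does not spell out the exact feature domain or model. The function names and the per-microphone loop are hypothetical and only show the linearization applied channel by channel, not the EM estimation.

```python
import math


def mismatch(x, h, n):
    """Assumed log-spectral mismatch: y = x + h + log(1 + exp(n - x - h))."""
    return x + h + math.log1p(math.exp(n - x - h))


def vts_first_order(x, h, n, x0, h0, n0):
    """First-order Taylor expansion of mismatch() around (x0, h0, n0)."""
    # g is the partial derivative of y with respect to x (and to h);
    # the partial derivative with respect to n is 1 - g.
    g = 1.0 / (1.0 + math.exp(n0 - x0 - h0))
    return (mismatch(x0, h0, n0)
            + g * (x - x0)
            + g * (h - h0)
            + (1.0 - g) * (n - n0))


def vts_per_microphone(x, channels, n, x0, channels0, n0):
    """Hypothetical helper: apply the expansion once per microphone channel."""
    return [vts_first_order(x, h, n, x0, h0, n0)
            for h, h0 in zip(channels, channels0)]


# Two microphones with different (illustrative) channel terms.
approx = vts_per_microphone(2.1, [-0.5, -0.8], 0.0, 2.0, [-0.5, -0.8], 0.0)
exact = [mismatch(2.1, h, 0.0) for h in [-0.5, -0.8]]
```

Near the expansion point, `approx` stays close to `exact`; the gap grows with the distance from (x0, h0, n0), which is why VTS schemes typically re-expand around updated estimates.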