ISCA Archive Interspeech 2011

Mapping sparse representation to state likelihoods in noise-robust automatic speech recognition

Katariina Mahkonen, Antti Hurmalainen, Tuomas Virtanen, Jort F. Gemmeke

This paper proposes learning-based methods for mapping a sparse representation of noisy speech to state likelihoods in an automatic speech recognition system. We represent speech as a sparse linear combination of exemplars extracted from training data. The exemplar weights are mapped to speech state likelihoods using Ordinary Least Squares (OLS) and Partial Least Squares (PLS) regression. Recognition experiments are conducted using the CHiME noisy speech database. The results show that both algorithms can be used successfully to train the mapping. We achieve improvements over the previous binary labeling system, and recognition scores close to 70% at -6 dB SNR.
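As a rough illustration of the OLS variant described above, the following sketch fits a linear map from exemplar weights to per-state scores and then turns those scores into normalized likelihoods. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`fit_ols_mapping`, `map_to_likelihoods`), the synthetic data, the array sizes, and the clip-and-normalize step are all hypothetical choices made for the example.

```python
import numpy as np


def fit_ols_mapping(activations, state_targets):
    """Least-squares fit of a linear map from exemplar weights to state scores.

    activations:   (n_frames, n_exemplars) non-negative exemplar weights
    state_targets: (n_frames, n_states) target state scores per frame
    Returns a mapping matrix of shape (n_exemplars, n_states).
    """
    mapping, *_ = np.linalg.lstsq(activations, state_targets, rcond=None)
    return mapping


def map_to_likelihoods(activations, mapping, floor=1e-8):
    """Apply the learned map, then floor and normalize each frame.

    The flooring and per-frame normalization are assumptions of this
    sketch; the abstract does not specify how scores are post-processed.
    """
    scores = np.clip(activations @ mapping, floor, None)
    return scores / scores.sum(axis=1, keepdims=True)


# Synthetic stand-in data (hypothetical sizes, not from the paper).
rng = np.random.default_rng(0)
n_frames, n_exemplars, n_states = 200, 30, 5

# Sparse, non-negative weights: roughly 20% of entries are active.
mask = rng.random((n_frames, n_exemplars)) < 0.2
activations = rng.random((n_frames, n_exemplars)) * mask

# Targets generated from a known linear map so the fit can be checked.
true_map = rng.random((n_exemplars, n_states))
state_targets = activations @ true_map

mapping = fit_ols_mapping(activations, state_targets)
likelihoods = map_to_likelihoods(activations, mapping)
```

In a real system the targets would come from labeled training data rather than a synthetic linear map, and the PLS variant would replace the least-squares solve with a regression on a reduced set of latent components.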