ISCA Archive Interspeech 2012

Improved model selection for the ASR-driven binary mask

William Hartmann, Eric Fosler-Lussier

In a previous study, we proposed an alternative masking criterion for binary mask estimation based on the underlying linguistic information. We estimated this mask by selecting from a set of candidate masks at each frame based on the hypotheses from an ASR system. Our previous system provided an 8% reduction in WER. In this work, we present an improved method for selecting the correct candidate mask at each frame, increasing the reduction in WER to 14%. Our new method uses a discriminative sequence model and provides a framework that can incorporate other mask estimates as features.

Index Terms: speech recognition, binary mask estimation
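
The abstract does not say which discriminative sequence model is used or how it is trained, so the following is only an illustrative sketch of per-frame candidate selection with a sequence model. It assumes a linear-chain formulation in which each candidate mask has a score at every frame and a pairwise score rewards or penalizes switching candidates between adjacent frames; Viterbi decoding then picks the best-scoring sequence. The function name `select_masks` and the inputs `frame_scores` and `transition` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def select_masks(frame_scores: Sequence[Sequence[float]],
                 transition: Sequence[Sequence[float]]) -> List[int]:
    """Pick one candidate mask index per frame by Viterbi decoding.

    frame_scores[t][j]: assumed score of candidate mask j at frame t
                        (for example, a weighted sum of features).
    transition[i][j]:   assumed score for moving from candidate i at
                        frame t-1 to candidate j at frame t.
    """
    if not frame_scores:
        return []
    n = len(frame_scores[0])

    # best[j] holds the best total score of any path ending in candidate j.
    best = list(frame_scores[0])
    backptrs = []  # backptrs[t-1][j] = best predecessor of j at frame t

    for scores in frame_scores[1:]:
        new_best = []
        ptrs = []
        for j in range(n):
            prev = max(range(n), key=lambda i: best[i] + transition[i][j])
            new_best.append(best[prev] + transition[prev][j] + scores[j])
            ptrs.append(prev)
        best = new_best
        backptrs.append(ptrs)

    # Trace back from the best final candidate.
    state = max(range(n), key=lambda i: best[i])
    path = [state]
    for ptrs in reversed(backptrs):
        state = ptrs[state]
        path.append(state)
    path.reverse()
    return path


# Toy example with two candidate masks and three frames.
toy_scores = [[1.0, 0.0], [0.0, 0.6], [1.0, 0.0]]
no_smoothing = [[0.0, 0.0], [0.0, 0.0]]
smoothing = [[0.5, -0.5], [-0.5, 0.5]]

independent_path = select_masks(toy_scores, no_smoothing)
sequence_path = select_masks(toy_scores, smoothing)
```

In this toy example, choosing each frame independently switches candidates at the middle frame, while the transition scores keep the selection on one candidate throughout. Additional mask estimates could enter such a model as extra terms in the per-frame scores, which is one way to read the abstract's remark about incorporating other estimates as features.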