ISCA Archive Interspeech 2012

Microphone array post-filter based on spatially-correlated noise measurements for distant speech recognition

Kenichi Kumatani, Bhiksha Raj, Rita Singh, John McDonough

This paper presents a new microphone-array post-filtering algorithm for distant speech recognition (DSR). Conventional post-filtering methods assume a static noise-field model and, under that assumption, use a Wiener filter mechanism to estimate the noise parameters. In contrast, we show how to build the Wiener post-filter from actual noise observations, without any noise-field assumption. The algorithm is framed within a state-of-the-art beamforming technique, namely maximum negentropy (MN) beamforming with super-directivity. We investigate the effectiveness of the proposed post-filter for DSR through experiments on noisy data collected in a car under different acoustic conditions. The experiments show that the new post-filtering mechanism achieves up to a 20% relative reduction in word error rate (WER) under the represented noise conditions, compared to a single distant microphone. By comparison, super-directive (SD) beamforming followed by Zelinski post-filtering achieves a relative WER reduction of only up to 11%. Other post-filters evaluated perform similarly in comparison to the proposed post-filter.

Index Terms: Microphone array, Post-filter, Distant speech recognition, Automotive speech application
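
As a rough illustration of the idea summarized in the abstract, the sketch below computes a textbook per-frequency Wiener gain from a measured noise power spectrum rather than from an assumed noise-field model. It is not the authors' algorithm: the function name `wiener_gain`, the `floor` parameter, and the example spectra are all hypothetical, and the multichannel, spatially-correlated formulation and the MN beamformer described in the paper are not reproduced here.

```python
def wiener_gain(noisy_psd, noise_psd, floor=0.0):
    """Textbook single-channel Wiener gain per frequency bin.

    noisy_psd: power of the (beamformed) noisy signal in each bin.
    noise_psd: noise power in each bin, measured from noise-only
               observations (an assumption made for this sketch).
    floor:     hypothetical lower bound on the gain.
    """
    if len(noisy_psd) != len(noise_psd):
        raise ValueError("spectra must have the same number of bins")

    gains = []
    for p_y, p_n in zip(noisy_psd, noise_psd):
        if p_y <= 0.0:
            # No signal energy in this bin: fall back to the floor.
            gains.append(floor)
            continue
        # Estimated clean-speech power divided by noisy power.
        gain = (p_y - p_n) / p_y
        # Clamp to [floor, 1] so overestimated noise cannot give a negative gain.
        gains.append(min(1.0, max(floor, gain)))
    return gains


# Illustrative values only, not data from the paper.
print(wiener_gain([4.0, 2.0, 1.0], [1.0, 2.0, 3.0]))  # [0.75, 0.0, 0.0]
```

Bins where the measured noise power meets or exceeds the observed power are fully suppressed in this sketch; a practical system would typically use a nonzero floor to limit distortion.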


doi: 10.21437/Interspeech.2012-107

Cite as: Kumatani, K., Raj, B., Singh, R., McDonough, J. (2012) Microphone array post-filter based on spatially-correlated noise measurements for distant speech recognition. Proc. Interspeech 2012, 298-301, doi: 10.21437/Interspeech.2012-107

@inproceedings{kumatani12_interspeech,
  author={Kenichi Kumatani and Bhiksha Raj and Rita Singh and John McDonough},
  title={{Microphone array post-filter based on spatially-correlated noise measurements for distant speech recognition}},
  year=2012,
  booktitle={Proc. Interspeech 2012},
  pages={298--301},
  doi={10.21437/Interspeech.2012-107},
  issn={2958-1796}
}