ISCA Archive Interspeech 2014

Evaluating robust features on deep neural networks for speech recognition in noisy and channel mismatched conditions

Vikramjit Mitra, Wen Wang, Horacio Franco, Yun Lei, Chris Bartels, Martin Graciarena

Deep Neural Network (DNN) based acoustic models have shown significant improvement over their Gaussian Mixture Model (GMM) counterparts in the last few years. While several studies evaluate the performance of GMM systems under noisy and channel-degraded conditions, noise-robustness studies on DNN systems have been far fewer. In this work we present a study exploring both conventional DNNs and deep Convolutional Neural Networks (CNNs) for noise- and channel-degraded speech recognition tasks using the Aurora4 dataset. We compare baseline mel-filterbank energies with noise-robust features that we proposed earlier and show that the robust features improve the performance of both DNNs and CNNs relative to mel-filterbank energies. We also show that vocal tract length normalization improves the performance of the robust acoustic features. Finally, we show that combining multiple systems yields further improvement in recognition accuracy.


doi: 10.21437/Interspeech.2014-224

Cite as: Mitra, V., Wang, W., Franco, H., Lei, Y., Bartels, C., Graciarena, M. (2014) Evaluating robust features on deep neural networks for speech recognition in noisy and channel mismatched conditions. Proc. Interspeech 2014, 895-899, doi: 10.21437/Interspeech.2014-224

@inproceedings{mitra14_interspeech,
  author={Vikramjit Mitra and Wen Wang and Horacio Franco and Yun Lei and Chris Bartels and Martin Graciarena},
  title={{Evaluating robust features on deep neural networks for speech recognition in noisy and channel mismatched conditions}},
  year=2014,
  booktitle={Proc. Interspeech 2014},
  pages={895--899},
  doi={10.21437/Interspeech.2014-224},
  issn={2308-457X}
}