ISCA Archive Interspeech 2014

Robust speech recognition using temporal masking and thresholding algorithm

Chanwoo Kim, Kean K. Chin, Michiel Bacchiani, Richard M. Stern

In this paper, we present a new dereverberation algorithm called Temporal Masking and Thresholding (TMT) to enhance the temporal spectra of spectral features for robust speech recognition in reverberant environments. The algorithm is motivated by the precedence effect and by temporal masking in human auditory perception. It improves on our previous dereverberation algorithm, Suppression of Slowly-varying components and the falling edge of the power envelope (SSF). TMT characterizes temporal masking and thresholding with a different mathematical model from the one underlying SSF: the nonlinear highpass filtering used in SSF is replaced by a masking mechanism based on a combination of peak detection and dynamic thresholding. Speech recognition results show that, in reverberant environments, TMT provides higher recognition accuracy than other algorithms such as LTLSS, VTS, or SSF.

doi: 10.21437/Interspeech.2014-157

Cite as: Kim, C., Chin, K.K., Bacchiani, M., Stern, R.M. (2014) Robust speech recognition using temporal masking and thresholding algorithm. Proc. Interspeech 2014, 2734-2738, doi: 10.21437/Interspeech.2014-157

  author={Chanwoo Kim and Kean K. Chin and Michiel Bacchiani and Richard M. Stern},
  title={{Robust speech recognition using temporal masking and thresholding algorithm}},
  booktitle={Proc. Interspeech 2014},