ISCA Archive Eurospeech 1997

Multiresolution channel normalization for ASR in reverberant environments

Carlos Avendano, Sangita Tibrewala, Hynek Hermansky

To overcome the problems associated with the long impulse responses produced by reverberation, we use a long time window (high frequency resolution) analysis during the channel normalization steps of the feature extraction process in automatic speech recognition (ASR). After normalization, frequency resolution is traded for time resolution to increase the rate at which the time information is sampled (short-time domain), yielding an appropriate domain from which to derive ASR features. Experiments on data with reverberation times of about 0.5 s show that the new technique significantly improves the performance of a speech recognizer under reverberation, with only some performance degradation on clean speech.
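The two-resolution idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: log-magnitude mean subtraction as the normalization step, overlap-add resynthesis followed by short-window re-analysis as the way to trade frequency resolution for time resolution, the 512 ms and 32 ms window lengths, and the function names `stft`, `istft`, and `multires_features`.

```python
import numpy as np

def stft(x, win_len, hop):
    """Hann-windowed frames of x, returned as one-sided spectra (frames x bins)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, win_len, hop):
    """Weighted overlap-add resynthesis with a Hann synthesis window."""
    window = np.hanning(win_len)
    n_frames = spec.shape[0]
    out = np.zeros((n_frames - 1) * hop + win_len)
    norm = np.zeros_like(out)
    frames = np.fft.irfft(spec, n=win_len, axis=1)
    for i in range(n_frames):
        out[i * hop:i * hop + win_len] += frames[i] * window
        norm[i * hop:i * hop + win_len] += window ** 2
    # Floor the normalizer so the weakly covered edges are tapered, not amplified.
    return out / np.maximum(norm, 0.1 * norm.max())

def multires_features(x, fs, long_ms=512.0, short_ms=32.0):
    """Normalize the channel at high frequency resolution, then re-analyze
    at high time resolution. Window lengths are hypothetical defaults."""
    # Step 1: long-window analysis (assumed comparable to the reverberation time).
    long_len = int(fs * long_ms / 1000)
    long_hop = long_len // 4
    spec = stft(x, long_len, long_hop)

    # Step 2: assumed normalization, subtracting the per-bin temporal mean
    # of the log magnitude (a fixed channel gain per bin cancels out).
    mean_log = np.log(np.abs(spec) + 1e-10).mean(axis=0, keepdims=True)
    normalized = spec * np.exp(-mean_log)

    # Step 3: back to the time domain, then short-window analysis.
    y = istft(normalized, long_len, long_hop)
    short_len = int(fs * short_ms / 1000)
    short_spec = stft(y, short_len, short_len // 2)
    return np.log(np.abs(short_spec) + 1e-10)
```

Calling `multires_features(x, 8000)` on a one-dimensional signal `x` sampled at 8 kHz returns a log-magnitude spectrogram with many more frames than the long-window analysis produced, which is the sense in which time information is sampled at a higher rate after normalization. Cepstral or other ASR features would then be derived from this short-time representation.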


doi: 10.21437/Eurospeech.1997-111

Cite as: Avendano, C., Tibrewala, S., Hermansky, H. (1997) Multiresolution channel normalization for ASR in reverberant environments. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 1107-1110, doi: 10.21437/Eurospeech.1997-111

@inproceedings{avendano97_eurospeech,
  author={Carlos Avendano and Sangita Tibrewala and Hynek Hermansky},
  title={{Multiresolution channel normalization for ASR in reverberant environments}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={1107--1110},
  doi={10.21437/Eurospeech.1997-111},
  issn={1018-4074}
}