ISCA Archive ICSLP 2002

Performance of discriminatively trained auditory features on Aurora2 and Aurora3

Brian Mak, Yik-Cheung Tam

The design of acoustic models involves two main tasks: feature extraction and data modeling. Hidden Markov modeling (HMM) is commonly used in contemporary automatic speech recognition. In the past, discriminative training has been applied successfully to refine HMM parameters that are initially trained by the EM algorithm. Recently, we applied discriminative training to the feature extraction process. We proposed a novel Discriminative Auditory Feature (DAF) extraction method in which the filters are discriminatively trained from data. In DAF, we make no assumptions about the functional form of the auditory filters except that they must be smooth and triangular-like. For the discriminative training itself, we also proposed an alternative approach to finding the competing hypotheses, which we call N-nearest hypotheses (as opposed to the traditional N-best hypotheses). By applying these two new ideas together with the new robust auditory features proposed by Li et al. of Bell Labs, we reduce the overall word error rate (WER) by 30.27% over the ICSLP2002 Aurora2 baseline with multi-condition training. Similarly, we obtain a relative WER reduction of 38.42% over the ICSLP2002 Aurora3 baseline.
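
The Aurora3 figure above is stated as a relative WER reduction, which is conventionally computed as the drop in WER divided by the baseline WER. The following is a minimal sketch of that arithmetic, not code from the paper; the function name `relative_wer_reduction` and the example WER values are hypothetical and chosen only for illustration.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction, in percent, of a system over a baseline.

    Both arguments are word error rates expressed in the same unit
    (for example, both in percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# Hypothetical WERs in percent (NOT taken from the paper).
example = relative_wer_reduction(baseline_wer=20.0, system_wer=15.0)
print(f"{example:.2f}% relative reduction")  # prints "25.00% relative reduction"
```

Under this definition, a relative reduction of 38.42% means the new system's WER is 61.58% of the baseline's WER, whatever the absolute values are.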