ISCA Archive Interspeech 2011

Hierarchical tandem features for ASR in Mandarin

Joel Pinto, Mathew Magimai-Doss, Hervé Bourlard

We apply multilayer perceptron (MLP) based hierarchical Tandem features to large vocabulary continuous speech recognition in Mandarin. Hierarchical Tandem features are estimated using a cascade of two MLP classifiers that are trained independently. The first classifier is trained on perceptual linear predictive (PLP) coefficients with a 90 ms temporal context. The second classifier is trained on the phonetic class conditional probabilities estimated by the first MLP, but with a longer temporal context of about 150 ms. Experiments on the Mandarin DARPA GALE eval06 dataset show a significant reduction (7.6% relative) in character error rate when hierarchical Tandem features are used in place of conventional Tandem features.
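
The cascade described above can be illustrated with a minimal sketch of the data flow. It is not the authors' implementation. It assumes a 10 ms frame shift, so that 90 ms and 150 ms correspond to 9 and 15 stacked frames. The feature dimension (39), the number of phonetic classes (70), the hidden layer size (100) and the helper names `stack_context`, `random_mlp` and `mlp_posteriors` are all hypothetical. The weights are random rather than trained, so only the shapes and the order of operations are meaningful.

```python
import numpy as np


def stack_context(frames, context):
    """Concatenate `context` consecutive frames (odd) centred on each frame.

    Edge frames are repeated so the output has one row per input frame.
    """
    half = context // 2
    padded = np.pad(frames, ((half, half), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + len(frames)] for i in range(context)])


def random_mlp(rng, n_in, n_hidden, n_out):
    """Untrained single-hidden-layer MLP parameters (placeholder for training)."""
    return (rng.normal(0.0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(0.0, 0.1, (n_hidden, n_out)), np.zeros(n_out))


def mlp_posteriors(x, w1, b1, w2, b2):
    """Forward pass: sigmoid hidden layer, softmax output over classes."""
    hidden = 1.0 / (1.0 + np.exp(-(x @ w1 + b1)))
    logits = hidden @ w2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
n_frames, plp_dim, n_classes = 200, 39, 70   # assumed sizes
plp = rng.normal(size=(n_frames, plp_dim))   # stand-in for real PLP features

# Stage 1: PLP features with 9 frames of context (90 ms at an assumed 10 ms shift).
x1 = stack_context(plp, 9)
mlp1 = random_mlp(rng, x1.shape[1], 100, n_classes)
post1 = mlp_posteriors(x1, *mlp1)

# Stage 2: first-stage posteriors with 15 frames of context (about 150 ms).
x2 = stack_context(post1, 15)
mlp2 = random_mlp(rng, x2.shape[1], 100, n_classes)
post2 = mlp_posteriors(x2, *mlp2)
```

In this sketch the second MLP sees 15 × 70 posterior values per frame, which is how it gains access to a longer temporal span than the first stage.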