ISCA Archive Eurospeech 1999

Neural network based optimal feature extraction for ASR

Narada D. Warakagoda, Magne H. Johnsen

The procedure for calculating Mel Frequency based Cepstral Coefficients (MFCC) is shown to resemble a three-layer Multilayer Perceptron (MLP)-like structure. Such an MLP is employed as a preprocessor in a hybrid HMM-MLP system, and the possibility of optimizing the whole system as a single entity with respect to a suitable criterion is pointed out. This system, together with the Maximum Mutual Information (MMI) criterion, was tested on a speaker-independent, five-broad-class, isolated phoneme recognition task. Results of these preliminary experiments, which clearly indicate the advantage of optimizable preprocessing, are reported.
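
The resemblance between MFCC computation and a three-layer MLP can be illustrated with a minimal sketch. It is one common reading of the analogy and not necessarily the exact formulation used in the paper: the power spectrum acts as the input layer, the mel filterbank followed by a logarithm acts as a hidden layer, and the DCT acts as a linear output layer. The function names, the mel-scale formula, the filter count, and the sample rate below are all assumptions chosen for illustration.

```python
import numpy as np


def hz_to_mel(f):
    # A widely used mel-scale formula (assumed here, not taken from the paper).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular mel filters as a weight matrix of shape (n_filters, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    weights = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (bin_freqs - left) / (center - left)
        falling = (right - bin_freqs) / (right - center)
        # Triangle: the smaller of the two slopes, clipped at zero.
        weights[i] = np.maximum(0.0, np.minimum(rising, falling))
    return weights


def dct_matrix(n_ceps, n_filters):
    """Unnormalized DCT-II basis as a weight matrix of shape (n_ceps, n_filters)."""
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_filters)


def mfcc_as_mlp(frame, w_mel, w_dct, eps=1e-10):
    """Compute cepstral coefficients for one frame, written as a layered forward pass."""
    power = np.abs(np.fft.rfft(frame)) ** 2   # input layer: power spectrum
    hidden = np.log(w_mel @ power + eps)      # hidden layer: linear weights + log nonlinearity
    return w_dct @ hidden                     # output layer: linear weights (DCT)


# Illustrative usage with assumed settings: 256-sample frame, 8 kHz, 20 filters, 12 coefficients.
frame = np.random.default_rng(0).standard_normal(256)
w_mel = mel_filterbank(n_filters=20, n_fft=256, sample_rate=8000)
w_dct = dct_matrix(n_ceps=12, n_filters=20)
coeffs = mfcc_as_mlp(frame, w_mel, w_dct)
```

Because `w_mel` and `w_dct` are ordinary weight matrices in this view, they could in principle be used as initial values and then adjusted by gradient-based training along with the rest of the recognizer, which is the kind of optimizable preprocessing the abstract describes.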