ISCA Archive Interspeech 2011

Minimum classification error based spectro-temporal feature extraction for robust audio classification

Yuan-Fu Liao, Chia-Hsing Lin, We-Der Fang

Mel-frequency cepstral coefficients (MFCCs) are the most popular features for automatic audio classification (AAC). However, MFCCs are often not robust in adverse environments. In this paper, a minimum classification error (MCE)-based method is proposed to extract new, robust spectro-temporal features as alternatives to MFCCs. The robustness of the proposed features is evaluated on noisy non-speech sounds from the RWCP Sound Scene Database in Real Acoustic Environment, using settings similar to the Aurora 2 multi-condition training task. Experimental results show that the proposed features achieve the lowest average recognition error rate, 3.17%, compared with 4.31% for state-of-the-art MFCCs with mean subtraction, variance normalization and ARMA filtering (MFCC+MVA), 4.43% for Gabor filters with principal component analysis (Gabor+PCA), and 4.20% for linear discriminant analysis (LDA) features. We thus confirm the robustness of the proposed spectro-temporal feature extraction approach.
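
For readers unfamiliar with the MFCC+MVA baseline named above, the following is a minimal sketch of MVA post-processing (mean subtraction, variance normalization, ARMA filtering) as it is commonly described in the literature. It is not taken from the paper. The function name `mva`, the default filter order, and the choice to leave the first and last `order` frames unsmoothed are assumptions made for illustration.

```python
import numpy as np

def mva(features, order=2):
    """Apply mean subtraction, variance normalization and ARMA smoothing.

    features: array of shape (T, D), one D-dimensional vector per frame.
    order:    ARMA filter order M (hypothetical default, not from the paper).
    """
    x = np.asarray(features, dtype=float)

    # Mean subtraction: remove the per-dimension mean over the utterance.
    x = x - x.mean(axis=0)

    # Variance normalization: scale each dimension to unit variance.
    # Dimensions with zero variance are left unscaled to avoid division by zero.
    std = x.std(axis=0)
    x = x / np.where(std > 0, std, 1.0)

    # ARMA smoothing: each output frame averages the previous M outputs
    # and the current plus next M normalized inputs.
    # Assumption: the first and last M frames are passed through unchanged.
    y = x.copy()
    n_frames = x.shape[0]
    for t in range(order, n_frames - order):
        past_outputs = y[t - order:t].sum(axis=0)
        current_and_future = x[t:t + order + 1].sum(axis=0)
        y[t] = (past_outputs + current_and_future) / (2 * order + 1)
    return y
```

Calling `mva(mfcc_matrix)` on a hypothetical `(T, 13)` matrix of MFCCs returns an array of the same shape. The filter order is normally tuned per task.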