ISCA Archive Interspeech 2015

Robust sound event classification using LBP-HOG based bag-of-audio-words feature representation

Hyungjun Lim, Myung Jong Kim, Hoirin Kim

This paper addresses the problem of sound event classification, focusing on feature extraction methods that are robust in noisy environments. In the real world, sound events are easily exposed to noise, which corrupts their distinctive temporal and spectral characteristics. Extracting robust features that represent these characteristics is therefore important for achieving good classification performance. In this paper, we employ a combination of local binary pattern (LBP) and histogram of oriented gradients (HOG), both motivated by image processing techniques, to capture local characteristics of the spectrogram image of noisy sound events. Furthermore, a bag-of-audio-words (BoAW) method is applied to the combination of LBP and HOG to capture global characteristics of the spectrogram image. The proposed method is evaluated on a database consisting of hundreds of audio clips for two groups of sound events aimed at audio surveillance applications. Test sounds are classified under various noise conditions using a support vector machine, and the proposed method shows over 20% relative improvement on average compared to BoAW methods based on other conventional features.
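The pipeline described in the abstract (local LBP and HOG descriptors on a spectrogram, pooled into a bag-of-audio-words histogram) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 8x8 non-overlapping blocks, the 8 unsigned orientation bins, the plain k-means codebook, the codebook size, and every function name below are assumptions chosen for illustration, and random arrays stand in for real spectrograms.

```python
import numpy as np

def lbp_codes(img):
    """8-neighbour LBP code for every interior pixel of a 2-D array."""
    h, w = img.shape
    center = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes

def hog_hist(block, n_bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    gy, gx = np.gradient(block.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # fold orientation into [0, pi)
    hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-12)

def local_descriptors(spec, block=8):
    """One concatenated LBP+HOG descriptor per non-overlapping block (assumed layout)."""
    descs = []
    for r in range(0, spec.shape[0] - block + 1, block):
        for c in range(0, spec.shape[1] - block + 1, block):
            patch = spec[r:r + block, c:c + block]
            lbp = np.bincount(lbp_codes(patch).ravel(), minlength=256).astype(float)
            lbp /= lbp.sum()
            descs.append(np.concatenate([lbp, hog_hist(patch)]))
    return np.array(descs)

def assign(x, centers):
    """Index of the nearest center (squared Euclidean distance) for each row of x."""
    d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(x, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; the codebook learner is an assumption of this sketch."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(x, centers)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = x[labels == j].mean(axis=0)
    return centers

def boaw(spec, codebook):
    """Normalized histogram of codeword assignments for one spectrogram."""
    labels = assign(local_descriptors(spec), codebook)
    hist = np.bincount(labels, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# Usage with random stand-ins for spectrogram images (not real audio data).
rng = np.random.default_rng(0)
clips = [rng.random((32, 32)) for _ in range(4)]
codebook = kmeans(np.vstack([local_descriptors(s) for s in clips]), k=8)
features = np.array([boaw(s, codebook) for s in clips])  # one row per clip
```

Each row of `features` is a fixed-length clip-level vector, regardless of clip duration, which is what makes it suitable as input to a classifier such as the support vector machine mentioned in the abstract.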