ISCA Archive Interspeech 2012

The log-Gabor method: speech classification using spectrogram image analysis

Harm Buisman, Eric Postma

We explored the suitability of the log-Gabor method, a speech analysis method inspired by Ezzat, Bouvrie & Poggio (2007), for automatic classification of personality and likability traits in speech. The core idea underlying the log-Gabor method is to treat the spectrogram as an image of spectro-temporal information. The image is transformed into Gabor energy values using the two-dimensional logarithmic Gabor transform, which is a standard feature extraction method in visual texture analysis. The aggregated energy values are mapped onto classes by means of a support vector machine (SVM). The log-Gabor method performed above baseline on both the INTERSPEECH Personality and Likability Sub-Challenges: 74.2% on the Likability task (baseline 58.0%) and 78.1% on the Personality task (baseline 70.3%). These results lead us to conclude that the log-Gabor method is a feasible method for extracting perceptual cues from speech.

Index Terms: spectro-temporal analysis, spectrogram analysis, log Gabor filters, likability classification, personality classification, support vector machines
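The feature-extraction stage described in the abstract (spectrogram treated as an image, filtered with a two-dimensional log-Gabor bank, energies aggregated into a vector) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the STFT settings, the filter-bank parameters (`n_scales`, `n_orients`, `min_wavelength`, `mult`, `sigma_f`, `sigma_theta`), and the choice of mean and standard deviation as the aggregation step are all assumptions made for illustration, and the input is a synthetic chirp rather than challenge data.

```python
import numpy as np

def spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude spectrogram (frequency x time) from a Hann-windowed STFT."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1)).T  # (n_fft // 2 + 1, n_frames)
    return np.log1p(mag)

def log_gabor_bank(shape, n_scales=3, n_orients=4, min_wavelength=4.0,
                   mult=2.0, sigma_f=0.55, sigma_theta=0.6):
    """2D log-Gabor transfer functions defined in the frequency domain.

    All parameter values are illustrative assumptions, not the paper's.
    """
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0  # placeholder to avoid log(0); DC is zeroed below
    theta = np.arctan2(fy, fx)
    filters = []
    for s in range(n_scales):
        f0 = 1.0 / (min_wavelength * mult ** s)  # centre frequency of scale s
        radial = np.exp(-np.log(radius / f0) ** 2 / (2 * np.log(sigma_f) ** 2))
        radial[0, 0] = 0.0  # log-Gabor filters have no DC component
        for o in range(n_orients):
            angle = o * np.pi / n_orients
            # Angular distance to the filter orientation, wrapped to [-pi, pi]
            d_theta = np.arctan2(np.sin(theta - angle), np.cos(theta - angle))
            angular = np.exp(-d_theta ** 2 / (2 * sigma_theta ** 2))
            filters.append(radial * angular)
    return filters

def log_gabor_features(spec):
    """Aggregate log-Gabor energy per filter into one feature vector."""
    spec_fft = np.fft.fft2(spec)
    feats = []
    for transfer in log_gabor_bank(spec.shape):
        response = np.fft.ifft2(spec_fft * transfer)  # complex filter response
        energy = np.abs(response) ** 2                # Gabor energy map
        # Assumed aggregation: mean and standard deviation over the image
        feats.extend([energy.mean(), energy.std()])
    return np.array(feats)

# Synthetic chirp plus noise stands in for a speech recording (assumption)
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signal = np.sin(2 * np.pi * (200 + 400 * t) * t) + 0.1 * rng.standard_normal(t.size)

features = log_gabor_features(spectrogram(signal))
print(features.shape)  # (24,) = 3 scales x 4 orientations x 2 statistics
```

In a full pipeline, one such vector per utterance would be stacked into a feature matrix and passed to a support vector machine, for example scikit-learn's `sklearn.svm.SVC`; the kernel and hyperparameters used in the paper are not stated in the abstract.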