ISCA Archive Interspeech 2012

Q-Gaussian based spectral subtraction for robust speech recognition

Hilman F. Pardede, Koichi Shinoda, Koji Iwano

Spectral subtraction (SS) is derived using maximum likelihood estimation under the assumption that noise and speech both follow Gaussian distributions and are independent of each other. Under this assumption, noisy speech (speech contaminated by noise) also follows a Gaussian distribution. However, it is well known that noisy speech observed in real situations often follows a heavy-tailed distribution rather than a Gaussian one. In this paper, we introduce the q-Gaussian distribution from non-extensive statistics to represent the distribution of noisy speech and derive a new spectral subtraction method based on it. In our analysis, the q-Gaussian distribution fits the noisy speech distribution better than the Gaussian distribution does. In speech recognition experiments on the Aurora-2 database, the proposed method, q-spectral subtraction (q-SS), outperformed the conventional SS method.
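To make the heavy-tail argument concrete, the following minimal sketch evaluates the standard zero-mean q-Gaussian density from Tsallis statistics for 1 <= q < 3 and compares it with the Gaussian case (q = 1). It also includes the conventional power-domain spectral subtraction rule as a baseline. The function names, the `beta` width parameter, and the `floor` argument are assumptions made for illustration. The q-SS estimator itself is not reproduced here, since its derivation is not given in this abstract.

```python
import math


def q_gaussian_pdf(x, q, beta=1.0):
    """Density of a zero-mean q-Gaussian for 1 <= q < 3.

    q = 1 gives the ordinary Gaussian; larger q gives heavier tails.
    beta is a hypothetical inverse-width parameter chosen for this sketch.
    """
    if not 1.0 <= q < 3.0:
        raise ValueError("this sketch only covers 1 <= q < 3")
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    if q - 1.0 < 1e-9:
        # Gaussian limit of the q-Gaussian.
        return math.sqrt(beta / math.pi) * math.exp(-beta * x * x)
    # Log of the normalising constant C_q, computed with lgamma so that
    # values of q close to 1 do not overflow the gamma function.
    log_c = (
        0.5 * math.log(math.pi)
        + math.lgamma((3.0 - q) / (2.0 * (q - 1.0)))
        - 0.5 * math.log(q - 1.0)
        - math.lgamma(1.0 / (q - 1.0))
    )
    # q-exponential of -beta * x^2: a power law instead of an exponential.
    tail = (1.0 + (q - 1.0) * beta * x * x) ** (-1.0 / (q - 1.0))
    return math.sqrt(beta) / math.exp(log_c) * tail


def spectral_subtract(noisy_power, noise_power, floor=0.0):
    """Conventional power spectral subtraction (baseline, not q-SS).

    Subtracts the noise power estimate from the noisy power in each
    frequency bin and clips the result at `floor`.
    """
    return [max(y - n, floor) for y, n in zip(noisy_power, noise_power)]


if __name__ == "__main__":
    # Compare tail mass of the Gaussian (q = 1) and a q-Gaussian (q = 1.5).
    for x in (0.0, 2.0, 4.0):
        print(x, q_gaussian_pdf(x, 1.0), q_gaussian_pdf(x, 1.5))
    # Baseline subtraction on two illustrative bins.
    print(spectral_subtract([4.0, 1.0], [1.0, 2.0]))
```

Both densities are evaluated at the same `beta`, so the comparison is illustrative rather than variance-matched. With q = 2 the density reduces to a Cauchy distribution, which is a convenient sanity check for the normalising constant.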

Index Terms: robust speech recognition, spectral subtraction, Gaussian distribution, q-Gaussian, maximum likelihood