ISCA Archive MAVEBA 2009

Indirect estimation of formant frequencies through mean spectral variance with application to automatic gender recognition

Unto K. Laine, O. J. Räsänen

This paper presents a novel approach for estimating speaker-specific vocal tract properties. It is shown that, as an alternative to the well-known long-term average spectrum (LTAS) of speech, the variance of the spectral magnitude in each frequency band is also suitable for estimating formant frequencies. This representation, called mean spectral variance (MSV), is applied to an automatic gender classification task, where it achieves good classification accuracy in combination with the fundamental frequency of speech. The MSV is compared with the LTAS, and their similarities and differences are discussed.
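
To make the distinction between the two representations concrete, the following is a minimal sketch, not the authors' implementation. It assumes the LTAS is the per-bin mean of short-time magnitude spectra and approximates the MSV as the per-bin variance of the same spectra over time. The abstract does not specify the band layout, scaling, or normalization used in the paper, so the function names, the Hann window, and the frame and hop lengths here are all illustrative assumptions.

```python
import numpy as np

def frame_spectra(signal, frame_len=512, hop=256):
    """Return magnitude spectra of Hann-windowed frames.

    Output shape is (n_frames, frame_len // 2 + 1).
    Frame and hop lengths are assumed values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))

def ltas_and_variance(signal, frame_len=512, hop=256):
    """Compute two long-term summaries of the short-time magnitude spectrum.

    ltas: mean magnitude per frequency bin (long-term average spectrum).
    msv:  variance of the magnitude per frequency bin over time
          (an assumed stand-in for the paper's mean spectral variance).
    """
    mags = frame_spectra(signal, frame_len, hop)
    ltas = mags.mean(axis=0)
    msv = mags.var(axis=0)
    return ltas, msv
```

Both outputs are indexed by frequency bin, so peaks in either curve can be inspected in the same way. In this sketch, a bin shows high variance only when its magnitude fluctuates from frame to frame, whereas a steady component raises the mean but not the variance.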

Index Terms. Formant estimation, gender classification, long-term feature averaging