ISCA Archive Eurospeech 1997

Using an auditory model and leaky autocorrelators to tune in to speech

T. Andringa

This paper introduces a method to estimate the spectrum of voiced speech in noise, based on an estimate of the fundamental frequency. The method uses the output of an auditory model that imitates the mechanics of the basilar membrane. The output of each segment of the model serves as input to a set of leaky autocorrelator units (simple neuron models), each sensitive to a particular periodicity (delay). When a noisy vowel is presented to the system, the units sensitive to the fundamental period of that vowel respond most actively. The activity of these responding autocorrelator units as a function of segment number is a direct measure of the spectrum of the vowel. The technique is very robust: like humans, it can detect the presence of a vowel at an SNR of -10 dB in aperiodic speech noise, and estimate formant frequencies at SNRs of -3 to -6 dB. With this technique it is possible to split a mixture of sound sources into auditory entities (percepts) on the basis of pitch.
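The idea of a bank of leaky autocorrelator units tuned to different delays can be illustrated with a minimal sketch. The abstract does not give the update rule, so everything below is an assumption made for illustration: the first-order leaky integration of the product x[n] * x[n - delay], the time constant tau, the function names, and the synthetic three-harmonic test signal that stands in for the output of a single basilar membrane segment (no auditory model is included).

```python
import math
import random


def harmonic_complex(period, n_samples, noise_std, seed=0):
    """Hypothetical stand-in for one segment's output: three harmonics plus noise."""
    rng = random.Random(seed)
    return [
        sum(math.sin(2.0 * math.pi * k * n / period) for k in (1, 2, 3))
        + noise_std * rng.gauss(0.0, 1.0)
        for n in range(n_samples)
    ]


def leaky_autocorrelation(x, delays, tau):
    """Run one leaky autocorrelator unit per delay and return the final activities.

    Assumed update rule (not taken from the paper):
        r[n] = r[n-1] + (x[n] * x[n - delay] - r[n-1]) / tau
    """
    activity = {d: 0.0 for d in delays}
    for n, sample in enumerate(x):
        for d in activity:
            # Samples before the start of the signal are treated as zero.
            delayed = x[n - d] if n >= d else 0.0
            activity[d] += (sample * delayed - activity[d]) / tau
    return activity


def estimate_period(x, delays, tau):
    """Return the delay (in samples) whose unit ends up most active."""
    activity = leaky_autocorrelation(x, delays, tau)
    return max(activity, key=activity.get)


# Noisy input with a fundamental period of 80 samples (illustrative values only).
noisy = harmonic_complex(period=80, n_samples=8000, noise_std=1.0)
print(estimate_period(noisy, range(60, 101), tau=1000.0))
```

In this sketch the most active unit is taken as the estimate of the fundamental period. With noise present the winning delay may fall a sample or so away from the true period, and a larger tau trades response speed for a smoother estimate. In the method described above, the same units would be run on every segment of the auditory model, and the activity at the winning delay across segments would give the spectral estimate.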