ISCA Archive Eurospeech 1995

Source separation by a functional model of amplitude demodulation

Frédéric Berthommier, Georg F. Meyer

The aim of this work is to separate complex sounds characterised by their fundamental frequency (F0) and their spectrum. We propose an elementary 'cocktail party' processor that works with these two features. The model applies DFT-based processing, channel by channel, to the output of a gammatone filterbank, and has three stages: pitch estimation, recovery of the source spectrum using the amplitude modulation (AM) frequency, and pattern matching using neural networks. The signal is demodulated by rectification, and the amplitude modulation frequency is obtained from the modulus of the Fourier transform. We build a two-dimensional tonotopic/AMtopic map in which the components of the complex sounds are well resolved, and we group them to recover separate spectra, using the sieve estimate of the fundamental frequency. Performance is reported for vowel/masker and vowel/vowel recognition tasks. We show that the method is a good alternative to the autocorrelogram proposed by Assmann and Summerfield [1] for periodicity analysis and complex sound separation.
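The demodulation step described above (rectification followed by the modulus of the Fourier transform) can be illustrated with a minimal sketch for a single channel. This is not the authors' implementation: the function name `am_frequency`, the choice of half-wave rectification, the Hann window, the 50 to 500 Hz search band, and the synthetic amplitude-modulated tone standing in for one gammatone channel output are all assumptions made for illustration.

```python
import numpy as np


def am_frequency(channel, fs, fmin=50.0, fmax=500.0):
    """Estimate the dominant AM frequency (Hz) of one band-pass channel.

    Hypothetical helper: rectify, take the DFT modulus, pick the largest
    peak inside an assumed pitch range [fmin, fmax].
    """
    # Demodulate by half-wave rectification (assumed; the abstract only
    # says "rectification").
    rectified = np.maximum(channel, 0.0)
    # Remove the DC term so it does not dominate the spectrum.
    rectified = rectified - rectified.mean()
    # Modulus of the Fourier transform of the windowed, rectified signal.
    window = np.hanning(len(rectified))
    modulus = np.abs(np.fft.rfft(rectified * window))
    freqs = np.fft.rfftfreq(len(rectified), d=1.0 / fs)
    # Search only the assumed AM (pitch) band, well below the carrier.
    band = (freqs >= fmin) & (freqs <= fmax)
    return freqs[band][np.argmax(modulus[band])]


# Synthetic stand-in for one filterbank channel: a 2 kHz carrier whose
# amplitude is modulated at 125 Hz (playing the role of F0).
fs = 16000
t = np.arange(2048) / fs
channel = (1.0 + 0.8 * np.cos(2 * np.pi * 125.0 * t)) * np.sin(2 * np.pi * 2000.0 * t)

estimate = am_frequency(channel, fs)
print(f"estimated AM frequency: {estimate:.1f} Hz")
```

With 2048 samples at 16 kHz the DFT bin spacing is about 7.8 Hz, so the estimate is only resolved to that granularity. Repeating the analysis over every channel of a filterbank would give one row per channel of a channel-by-AM-frequency map, in the spirit of the tonotopic/AMtopic map mentioned in the abstract.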