ISCA Archive Eurospeech 1997

A probabilistic model of double-vowel segregation

Laurent Varin, Frédéric Berthommier

The decomposition principle was first proposed by Varga and Moore [] and applied to Automatic Speech Recognition (ASR) in noise. We present a new adaptation of this principle to model the schema-based streaming process inferred from psychoacoustical studies []. We address the classical problem of double-vowel segregation. Signal decomposition is made possible by an internal statistical model of vowel spectra. The decomposition model reconstructs the spectra of superimposed signals after identifying either the dominant member only or both members of the pair. It involves three stages. The first is a module that performs identification when the input is a mixture of interfering signals; prior identification of the dominant spectrum avoids combinatorial reconstruction. The second is an estimate of the mixture coefficient, also based on an internal representation of spectra. Finally, the spectra are reconstructed probabilistically by likelihood maximisation, using the labels and the mixture coefficient. The model is tested on a large database of synthetic vowels.
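The three stages can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the 4-channel template spectra in `TEMPLATES`, the linear mixing model x = alpha*s1 + (1 - alpha)*s2, the isotropic Gaussian model around each template (under which maximising likelihood reduces to minimising squared error), and the function names. The two vowels are also assumed to be different.

```python
# Hypothetical "internal model": one mean spectrum per vowel (4 channels).
# The values are illustrative only and are not taken from the paper.
TEMPLATES = {
    "a": [1.0, 4.0, 2.0, 0.5],
    "i": [4.0, 0.5, 0.5, 3.0],
    "u": [3.0, 2.0, 0.2, 0.2],
}


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def sub(u, v):
    return [p - q for p, q in zip(u, v)]


def identify_dominant(x):
    """Stage 1: pick the label whose template is closest to the mixture."""
    def sq_dist(label):
        d = sub(x, TEMPLATES[label])
        return dot(d, d)
    return min(TEMPLATES, key=sq_dist)


def estimate_alpha(x, m1, m2):
    """Stage 2: least-squares alpha in x ~ alpha*m1 + (1-alpha)*m2, clipped to [0, 1]."""
    d = sub(m1, m2)
    alpha = dot(sub(x, m2), d) / dot(d, d)
    return min(1.0, max(0.0, alpha))


def segregate(x):
    """Stage 3: choose the second label and reconstruct both spectra."""
    dominant = identify_dominant(x)
    m1 = TEMPLATES[dominant]
    best = None
    # Knowing the dominant label, only the remaining labels are searched,
    # rather than every possible pair.
    for label, m2 in TEMPLATES.items():
        if label == dominant:
            continue
        alpha = estimate_alpha(x, m1, m2)
        r = [xi - alpha * p - (1 - alpha) * q for xi, p, q in zip(x, m1, m2)]
        err = dot(r, r)  # smaller squared error = higher Gaussian likelihood
        if best is None or err < best[0]:
            best = (err, label, alpha, r)
    _, second, alpha, r = best
    m2 = TEMPLATES[second]
    # Most probable pair (s1, s2) under equal isotropic Gaussian priors,
    # subject to alpha*s1 + (1-alpha)*s2 == x: share the residual r
    # between the two templates in proportion to their mixing weights.
    w = alpha ** 2 + (1 - alpha) ** 2
    s1 = [p + alpha * ri / w for p, ri in zip(m1, r)]
    s2 = [q + (1 - alpha) * ri / w for q, ri in zip(m2, r)]
    return dominant, second, alpha, s1, s2


if __name__ == "__main__":
    # Mix "a" (weight 0.7) with "i" (weight 0.3) and try to recover them.
    mix = [0.7 * p + 0.3 * q for p, q in zip(TEMPLATES["a"], TEMPLATES["i"])]
    print(segregate(mix))
```

By construction, the reconstructed spectra always recombine exactly to the observed mixture with the estimated coefficient. A fuller model would presumably use per-channel variances and a more realistic spectral representation than this toy one.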