ISCA Archive Interspeech 2006

A probabilistic graphical model for microphone array source separation using rich pre-trained source models

H. T. Attias

Voice-based computing applications, such as phone communication and speech recognition, use microphone arrays to capture voice from a human speaker. In many environments of interest, however, sounds from other sources interfere with the speaker’s voice, posing severe problems for subsequent processing. This paper describes a new framework for treating this problem, and presents and demonstrates a new algorithm for the cancellation of interfering sounds. Our framework combines techniques from statistical machine learning with ideas from speech and audio processing. An important feature involves training rich probabilistic models on data from different types of relevant sound sources. Those source models are then incorporated into a larger probabilistic model of the observed microphone data. Using that model we derive our algorithm, which is of the expectation-maximization type and infers from the data the clean sound of each individual source. We report very good results on data recorded in different environments.