ISCA Archive Eurospeech 2001

Statistical sound source identification in a real acoustic environment for robust speech recognition using a microphone array

Takanobu Nishiura, Satoshi Nakamura, Kiyohiro Shikano

For a hands-free speech interface, it is very important to capture distant-talking speech with high quality. A microphone array is an ideal candidate for this purpose. However, this approach requires localizing the target talker. To cope with this problem, we propose a new talker localization method consisting of two algorithms. The first localizes multiple sound sources based on CSP (Cross-power Spectrum Phase) analysis. The second identifies which of the localized sound sources is the talker. In this paper, we focus on the second algorithm: statistical sound source identification among the localized sources, using a microphone array together with statistical speech and environmental sound models based on GMMs (Gaussian Mixture Models).
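As a rough illustration of the two steps named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a single microphone pair, a circularly shifted toy signal, and two hand-written diagonal-covariance GMMs over 2-D features. The function names (`csp_delay`, `gmm_loglik`, `identify`), the model parameters, and the feature values are all hypothetical. The CSP step whitens the cross-power spectrum so that only phase remains and takes the peak of its inverse transform as the delay; the identification step picks the model with the highest average frame log-likelihood.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for a short toy signal)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def csp_delay(x1, x2):
    """Estimate how many samples x2 lags x1 using phase-only cross-correlation."""
    X1, X2 = dft(x1), dft(x2)
    cross = [b * a.conjugate() for a, b in zip(X1, X2)]
    # Discard magnitude and keep only the phase of the cross-power spectrum.
    whitened = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    csp = [v.real for v in dft(whitened, inverse=True)]
    n = len(csp)
    peak = max(range(n), key=lambda i: csp[i])
    # Indices above n/2 correspond to negative lags (circular correlation).
    return peak if peak <= n // 2 else peak - n


def gmm_loglik(frame, gmm):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples, one per component.
    """
    comps = []
    for w, mu, var in gmm:
        ll = math.log(w)
        for x, m, v in zip(frame, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        comps.append(ll)
    top = max(comps)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comps))


def identify(frames, models):
    """Return the model name with the highest average frame log-likelihood."""
    scores = {
        name: sum(gmm_loglik(f, g) for f in frames) / len(frames)
        for name, g in models.items()
    }
    return max(scores, key=scores.get), scores


# Toy two-microphone signal: the second channel is the first delayed by 5 samples.
random.seed(0)
x1 = [random.gauss(0.0, 1.0) for _ in range(64)]
x2 = x1[-5:] + x1[:-5]
delay = csp_delay(x1, x2)

# Hypothetical models and features, chosen only to make the example run.
models = {
    "speech": [(0.6, [1.0, 1.0], [0.5, 0.5]), (0.4, [2.0, 0.5], [0.5, 0.5])],
    "non_speech": [(0.5, [-1.0, -1.0], [0.5, 0.5]), (0.5, [-2.0, 0.0], [0.5, 0.5])],
}
frames = [[1.0, 0.9], [1.2, 1.1], [0.8, 1.0]]
label, scores = identify(frames, models)
```

In a real system the GMMs would be trained on speech and environmental sound databases rather than written by hand, and the identified features would come from the signal captured in the direction of each localized source; both details are outside what this sketch shows.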