ISCA Archive Interspeech 2006

Voice activity detector based on enhanced cumulant of LPC residual and on-line EM algorithm

David Cournapeau, Tatsuya Kawahara, Kenji Mase, Tomoji Toriyama

This paper addresses the problem of segmenting audio data recorded with embedded devices for intelligent sensing in multi-modal interactions. We propose a real-time method for robust speech detection in natural, noisy environments. It is based on a fusion of higher-order statistics of the LPC residual and autocorrelation, and adopts an on-line version of the Expectation-Maximization (EM) algorithm for classification. Experimental evaluations show that the proposed method provides better detection performance under different types of natural noise and works robustly against other voices in multi-speaker interactive situations. Because the proposed method relies on features with a low computational cost and has low latency, it is suitable for real-time tracking applications.
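To make the idea of a higher-order statistic of the LPC residual concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the "enhanced" cumulant, the fusion with autocorrelation, or the on-line EM classifier, so this example only computes the ordinary excess kurtosis (normalized fourth-order cumulant) of the residual of one frame. The function names, the LPC order of 10, and the use of the autocorrelation method with the Levinson-Durbin recursion are all assumptions made for illustration.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation of one frame for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with x_hat[n] = sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]                      # prediction error energy
    for i in range(1, order + 1):
        if err <= 0.0:              # silent or degenerate frame: stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err               # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:]

def lpc_residual(frame, order=10):
    """Prediction error e[n] = x[n] - sum_k a[k] * x[n-k] (assumed order 10)."""
    a = levinson_durbin(autocorrelation(frame, order), order)
    return [frame[n] - sum(a[k - 1] * frame[n - k]
                           for k in range(1, min(order, n) + 1))
            for n in range(len(frame))]

def excess_kurtosis(x):
    """Normalized fourth-order cumulant: m4 / m2^2 - 3 (zero for a Gaussian)."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    m2 = sum(v ** 2 for v in centered) / len(x)
    if m2 == 0.0:
        return 0.0
    m4 = sum(v ** 4 for v in centered) / len(x)
    return m4 / (m2 * m2) - 3.0

def residual_kurtosis(frame, order=10):
    """Hypothetical per-frame feature: kurtosis of the LPC residual."""
    return excess_kurtosis(lpc_residual(frame, order))
```

The intuition behind such a feature is that the residual of voiced speech resembles a sparse pulse train, whose kurtosis is large, whereas the residual of Gaussian-like noise has kurtosis near zero. A detector would still need a decision rule on top of this value, which is the role the abstract assigns to the on-line EM classifier.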