ISCA Archive ISCSLP 2006

Speech Endpoint Detection Based on Sub-band Energy and Harmonic Structure of Voice

Yanmeng Guo, Qiang Fu, Yonghong Yan

This paper presents an algorithm for speech endpoint detection in noisy environments, especially those with non-stationary noise. The input signal is first decomposed into several sub-bands. In each sub-band, an energy sequence is tracked and analyzed separately to decide whether a temporal segment is stationary. A voiced-speech detection algorithm based on the harmonic structure of voice is proposed and applied to each non-stationary segment to check whether it contains speech. The endpoints of speech are finally determined by combining the results of energy detection and voice detection. Experiments in real noise environments show that the proposed approach is more reliable than some standard methods.
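The two building blocks named in the abstract, sub-band energy tracking and a harmonicity check, can be illustrated with a minimal Python sketch. This is not the authors' method: the abstract gives no band layout, thresholds, or harmonic measure, so everything below is an assumption made for illustration. The function names (`subband_energies`, `is_stationary`, `harmonic_score`), the equal-width FFT bands, the 3 dB energy-spread criterion, and the use of a normalized autocorrelation peak as a stand-in for harmonic-structure analysis are all hypothetical choices.

```python
import numpy as np


def subband_energies(signal, frame_len=256, hop=128, n_bands=4):
    """Return per-frame sub-band energies, shape (n_frames, n_bands).

    Assumption: bands are contiguous, roughly equal-width slices of the
    FFT power spectrum; the paper may use a different decomposition.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    energies = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        energies[i] = [band.sum() for band in np.array_split(power, n_bands)]
    return energies


def is_stationary(energy_seq, max_spread_db=3.0):
    """Assumed criterion: a segment is stationary in one sub-band if its
    energy varies by less than max_spread_db between minimum and maximum."""
    e = np.maximum(np.asarray(energy_seq, dtype=float), 1e-12)
    spread_db = 10.0 * np.log10(e.max() / e.min())
    return bool(spread_db < max_spread_db)


def harmonic_score(frame, sample_rate, f0_min=80.0, f0_max=400.0):
    """Stand-in harmonicity measure: the peak of the normalized
    autocorrelation within the plausible pitch-lag range.
    Values near 1 suggest a periodic (voiced) frame."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1 :]
    if ac[0] <= 0.0:
        return 0.0  # silent frame, no periodicity to measure
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    return float(ac[lag_min : lag_max + 1].max() / ac[0])
```

In a detector built along the lines of the abstract, `is_stationary` would be evaluated per sub-band over a temporal segment, and `harmonic_score` would be computed only on frames from segments flagged as non-stationary; a frame would count as speech when the score exceeds a tuned threshold. The threshold values and the rule for merging the two decisions into endpoints are left open here because the abstract does not specify them.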