ISCA Archive Interspeech 2006

Low-complexity and efficient classification of voiced/unvoiced/silence for noisy environments

Tuan Van Pham, Gernot Kubin

This paper describes a low-complexity and efficient speech classifier for noisy environments. The proposed algorithm exploits the time-scale analysis of the wavelet decomposition to classify speech frames into voiced, unvoiced, and silence classes. The classifier uses a single multidimensional feature, which is extracted by applying the Teager energy operator to the wavelet coefficients. The feature is enhanced and compared with quantile-based adaptive thresholds to detect phonetic classes. Furthermore, to save memory, the adaptive thresholds are replaced by a slope-tracking method applied to the filtered feature. These algorithms are tested on the TIMIT database with additive white, car, and factory noise, and compared with other methods to demonstrate their superior performance and robustness.
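
The two building blocks named in the abstract, a wavelet decomposition and the Teager energy operator, can be illustrated with a minimal sketch. It is not the authors' implementation: the one-level Haar wavelet, the mean-absolute-energy summary in `frame_feature`, and the simple `quantile` helper are assumptions chosen only to keep the example short, and the abstract does not specify the wavelet family, the number of decomposition levels, or how the feature is enhanced. Only the discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], is a standard definition.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1].

    Returns len(x) - 2 values, because the two edge samples have no neighbour.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def haar_dwt(x):
    """One level of a Haar wavelet decomposition (assumed wavelet choice).

    Returns (approximation, detail) coefficient lists of length len(x) // 2.
    """
    s = 1.0 / math.sqrt(2.0)
    starts = range(0, len(x) - 1, 2)
    approx = [s * (x[i] + x[i + 1]) for i in starts]
    detail = [s * (x[i] - x[i + 1]) for i in starts]
    return approx, detail


def frame_feature(frame):
    """Hypothetical two-dimensional frame feature (an assumption, not the paper's).

    Computes the mean absolute Teager energy of each Haar subband.
    The frame needs at least 6 samples so each subband has 3 coefficients.
    """
    def mean_abs_teo(coeffs):
        energies = teager_energy(coeffs)
        return sum(abs(e) for e in energies) / len(energies)

    approx, detail = haar_dwt(frame)
    return mean_abs_teo(approx), mean_abs_teo(detail)


def quantile(values, q):
    """Simple order-statistic quantile over a buffer of past feature values."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]
```

For a pure sinusoid A*cos(w*n), `teager_energy` returns the constant A^2 * sin(w)^2, which is why the operator tracks both amplitude and frequency. In a quantile-based scheme, a threshold could be derived from `quantile` applied to a buffer of recent feature values; the buffer length and the quantile level would be tuning parameters not given in the abstract.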