ISCA Archive Interspeech 2010

Perceptual wavelet decomposition for speech segmentation

Mariusz Ziółko, Jakub Gałka, Bartosz Ziółko, Tomasz Drwięga

A non-uniform speech segmentation method based on the wavelet packet transform is used to localise phoneme boundaries. Eleven subbands are chosen by applying the mean best basis algorithm. A perceptual scale guides the decomposition of speech with the Meyer wavelet in the wavelet packet structure. A real-valued vector representing the digital speech signal is divided into phone-like units by placing segment borders according to the result of the multiresolution analysis. The final decision on the location of the boundaries is made by analysing the energy flows among the decomposition levels.
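
The idea of placing segment borders where energy moves between decomposition levels can be illustrated with a minimal sketch. It is not the authors' implementation and makes several simplifying assumptions: the Haar wavelet stands in for the Meyer wavelet, a plain three-level dyadic tree stands in for the eleven perceptual subbands selected by the mean best basis algorithm, and the "energy flow" is taken to be the change in the normalised per-frame energy distribution across subbands. The function names and the threshold value are hypothetical.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def subband_energies(signal, levels=3):
    """Per-frame energy in each subband; one frame spans 2**levels samples."""
    frame_len = 2 ** levels
    n_frames = len(signal) // frame_len
    x = list(signal[: n_frames * frame_len])
    bands = []  # coefficient lists, finest detail band first
    for _ in range(levels):
        x, d = haar_step(x)
        bands.append(d)
    bands.append(x)  # coarsest approximation band
    energies = []
    for f in range(n_frames):
        row = []
        for b in bands:
            per_frame = len(b) // n_frames  # coefficients of this band per frame
            chunk = b[f * per_frame:(f + 1) * per_frame]
            row.append(sum(c * c for c in chunk))
        energies.append(row)
    return energies

def detect_boundaries(energies, threshold=0.5):
    """Frames where the energy distribution across subbands shifts sharply."""
    boundaries = []
    prev = None
    for f, row in enumerate(energies):
        total = sum(row)
        dist = [e / total for e in row] if total > 0 else [0.0] * len(row)
        if prev is not None:
            # Assumed "flow" measure: half the L1 distance between distributions.
            flow = 0.5 * sum(abs(a - b) for a, b in zip(dist, prev))
            if flow > threshold:
                boundaries.append(f)
        prev = dist
    return boundaries

# Synthetic signal: a constant segment followed by a rapidly alternating one.
signal = [1.0] * 256 + [(-1.0) ** n for n in range(256)]
boundaries = detect_boundaries(subband_energies(signal, levels=3))
print([f * 8 for f in boundaries])  # sample positions of borders → [256]
```

In this toy signal all energy sits in the coarsest band for the first 256 samples and in the finest detail band afterwards, so a single border is reported at sample 256. Real speech would need a perceptually motivated subband tree and a tuned decision rule.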