ISCA Archive Interspeech 2006

Bayesian networks for phonetic classification using time-scale features

Franz Pernkopf, Tuan Van Pham

We present a phonetic classification approach based on Bayesian networks using time-scale features extracted from the discrete wavelet transform. We apply Bayesian networks with generative and discriminative parameter and/or structure learning to classify speech frames into six categories: silence, voiced, unvoiced, mixed sounds, voiced closure, and release of plosives. Gender-dependent and gender-independent experiments were performed on the TIMIT database. The experiments show that (i) our time-scale features mostly outperform standard MFCC features, and (ii) discriminative learning of Bayesian networks is superior to the generative approach.
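
As a rough illustration of what frame-level time-scale features from a discrete wavelet transform can look like, the following minimal sketch computes the log-energy of each wavelet subband for one speech frame. The abstract does not specify the wavelet, the decomposition depth, or the exact feature definition, so the Haar wavelet, the three levels, the log-energy measure, and the function names `haar_dwt_step` and `subband_log_energies` are all assumptions made for this example and not the authors' method.

```python
import math

def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).

    Assumption: Haar is used only because it is the simplest wavelet;
    an unpaired trailing sample (odd length) is dropped.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def subband_log_energies(frame, levels=3, eps=1e-10):
    """Log-energy of each detail subband, followed by the final approximation band.

    Hypothetical feature definition: `levels` and `eps` are illustrative defaults.
    """
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        # Detail coefficients capture the higher-frequency content at this scale.
        features.append(math.log(sum(d * d for d in detail) + eps))
    # The remaining approximation holds the lowest-frequency content.
    features.append(math.log(sum(a * a for a in approx) + eps))
    return features

# Example: a 64-sample frame of a synthetic sinusoid yields levels + 1 features.
frame = [math.sin(2.0 * math.pi * k / 16.0) for k in range(64)]
features = subband_log_energies(frame, levels=3)
```

A feature vector of this kind, computed per frame, could then be passed to a classifier over the six frame categories named in the abstract. The frame length should be divisible by 2 raised to the number of levels so that no samples are dropped.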