ISCA Archive Interspeech 2012

Estimation of the vocal tract shape of nasals using a Bayesian scheme

Christian H. Kasess, Wolfgang Kreuzer, Ewald Enzinger, Nadja Kerschhofer-Puhalo

For nasal stops and nasalized vowels, one-tube models offer only an inadequate representation. To model the spectral components of nasal speech signals, a minimum of two connected tubes is necessary. Typically, the estimation of branched-tube area functions is based on a pole-zero model. The present paper introduces a variational Bayesian scheme under Gaussian assumptions to estimate the tube areas directly from the log-spectrum of the speech signal. Probabilistic priors are used to enforce smoothness of the tubes. The method is tested on recorded tokens of /m/ from several speakers using different prior variances. Results show that mild smoothness assumptions yield the best results in terms of model error and marginal likelihood. Furthermore, while yielding comparable fits, the estimated reflection coefficients from the Bayesian scheme show less intra-subject variability between tokens than an unregularized non-linear solver.
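
As a rough illustration of two quantities named in the abstract, the sketch below computes reflection coefficients from a piecewise-constant tube area function and evaluates an unnormalized Gaussian smoothness prior on it. The abstract does not state the paper's parametrization, so everything here is an assumption made for illustration: the sign convention r_k = (A_k - A_{k+1}) / (A_k + A_{k+1}), the choice to penalize differences of adjacent log-areas, and the hypothetical names `reflection_coefficients`, `smoothness_log_prior`, and `prior_variance`. It covers a single unbranched tube only and is not the variational scheme itself.

```python
import math


def reflection_coefficients(areas):
    """Reflection coefficient at each junction between adjacent tube sections.

    Assumed convention (sign conventions differ between texts):
    r_k = (A_k - A_{k+1}) / (A_k + A_{k+1}).
    """
    return [(a - b) / (a + b) for a, b in zip(areas, areas[1:])]


def smoothness_log_prior(areas, prior_variance):
    """Unnormalized log-density of a zero-mean Gaussian prior on the
    differences of adjacent log-areas (an assumed form of smoothness prior).

    A smaller prior_variance penalizes area jumps more strongly.
    """
    diffs = [math.log(b) - math.log(a) for a, b in zip(areas, areas[1:])]
    return -0.5 * sum(d * d for d in diffs) / prior_variance


# Hypothetical area functions in arbitrary units.
uniform = [2.0, 2.0, 2.0, 2.0]
stepped = [2.0, 1.0, 3.0, 1.5]

print(reflection_coefficients(uniform))
print(reflection_coefficients(stepped))
print(smoothness_log_prior(stepped, prior_variance=1.0))
print(smoothness_log_prior(stepped, prior_variance=0.1))
```

Under these assumptions a uniform tube has zero reflection at every junction and the highest prior value, while shrinking `prior_variance` makes the same stepped area function less probable, which is the sense in which a prior variance controls how strongly smoothness is enforced.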

Index Terms: vocal tract, estimation, nasal stops, Bayesian statistics

doi: 10.21437/Interspeech.2012-219

Cite as: Kasess, C.H., Kreuzer, W., Enzinger, E., Kerschhofer-Puhalo, N. (2012) Estimation of the vocal tract shape of nasals using a Bayesian scheme. Proc. Interspeech 2012, 699-702, doi: 10.21437/Interspeech.2012-219

  author={Christian H. Kasess and Wolfgang Kreuzer and Ewald Enzinger and Nadja Kerschhofer-Puhalo},
  title={{Estimation of the vocal tract shape of nasals using a Bayesian scheme}},
  booktitle={Proc. Interspeech 2012},