ISCA Archive Interspeech 2012

A full-band adaptive harmonic representation of speech

Gilles Degottex, Yannis Stylianou

In this paper we present a full-band Adaptive Harmonic Model (aHM) that is able to accurately reconstruct stationary and non-stationary parts of speech. The model requires neither a voiced/unvoiced decision nor an accurate estimate of the pitch contour. Its robustness is based on a previously suggested adaptive Quasi-Harmonic Model (aQHM), which provides a mechanism for frequency correction and adapts its basis functions to the characteristics of the input signal. The suggested method overcomes the limitations of the initial aQHM-based method in detecting frequency tracks over time, especially at mid and high frequencies, by employing a band-limited iterative procedure for re-estimating the fundamental frequency. Listening tests show that speech reconstructed with aHM is largely indistinguishable from the original signal, outperforming standard sinusoidal models (SM) and the aQHM-based method, while using fewer parameters for the reconstruction than SM.

Index Terms: Sinusoidal model, quasi-harmonic model, nonstationary basis, speech analysis.
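
As a rough illustration of what a full-band harmonic representation with time-varying basis functions involves, the following minimal sketch synthesizes a signal whose harmonics follow a time-varying fundamental frequency up to the Nyquist frequency. It is not the authors' implementation: the function name `synth_adaptive_harmonic`, its parameters, the use of real amplitudes with zero phase offsets, and the cumulative-sum approximation of the phase integral are all assumptions made for illustration. The analysis side (frequency correction and iterative re-estimation of the fundamental frequency) is not covered.

```python
import numpy as np


def synth_adaptive_harmonic(f0, amps, fs):
    """Synthesize a sum of harmonics that follow a time-varying f0.

    Hypothetical helper, for illustration only.
    f0   : array of shape (n_samples,), fundamental frequency in Hz
    amps : array of shape (n_harmonics, n_samples), real amplitudes
    fs   : sampling rate in Hz
    """
    f0 = np.asarray(f0, dtype=float)
    amps = np.asarray(amps, dtype=float)

    # Phase of the fundamental: running integral of the instantaneous
    # frequency, approximated here by a cumulative sum.
    phase0 = 2.0 * np.pi * np.cumsum(f0) / fs

    # Harmonic numbers 1..K as a column, so they broadcast over time.
    k = np.arange(1, amps.shape[0] + 1)[:, None]

    # Full-band: keep every harmonic below Nyquist, silence the rest.
    below_nyquist = (k * f0[None, :]) < fs / 2.0

    # Each basis function cos(k * phase0) bends with the f0 contour.
    return np.sum(amps * below_nyquist * np.cos(k * phase0[None, :]), axis=0)


if __name__ == "__main__":
    fs = 16000
    n = fs // 10  # 100 ms
    f0 = np.linspace(120.0, 180.0, n)  # rising pitch glide
    n_harm = int((fs / 2) // f0.min())  # enough harmonics to reach Nyquist
    amps = np.ones((n_harm, n)) / n_harm
    signal = synth_adaptive_harmonic(f0, amps, fs)
    print(signal.shape)
```

Because every harmonic is tied to the same phase track, the parameters per frame reduce to one fundamental frequency plus one amplitude per harmonic, whereas a general sinusoidal model also stores a separate frequency for each component.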


doi: 10.21437/Interspeech.2012-138

Cite as: Degottex, G., Stylianou, Y. (2012) A full-band adaptive harmonic representation of speech. Proc. Interspeech 2012, 382-385, doi: 10.21437/Interspeech.2012-138

@inproceedings{degottex12_interspeech,
  author={Gilles Degottex and Yannis Stylianou},
  title={{A full-band adaptive harmonic representation of speech}},
  year=2012,
  booktitle={Proc. Interspeech 2012},
  pages={382--385},
  doi={10.21437/Interspeech.2012-138},
  issn={2958-1796}
}