ISCA Archive Interspeech 2011
ISCA Archive Interspeech 2011

A long-term harmonic plus noise model for speech signals

Faten Ben Ali, Laurent Girin, Sonia Djaziri Larbi

The harmonic plus noise model (HNM) is widely used for spectral modeling of mixed harmonic/noise speech sounds. In this paper, we present an analysis/synthesis system based on a long-term two-band HNM. "Long-term" means that the time-trajectories of the HNM parameters are modeled using "smooth" (discrete cosine) functions depending on a small set of parameters. The goal is to capture and exploit the long-term correlation of spectral components on time segments of up to several hundreds of ms. The proposed long-term HNM enables joint compact representation of signals (thus a potential for low bit-rate coding) and easy signal transformation (e.g. time stretching) directly from the long-term parameters. Experiments show that it can be compared favourably with the short-term version in terms of parameter rates and signal quality.