ISCA Archive Interspeech 2007

A fine pitch model for speech

Jasha Droppo, Alex Acero

An accurate model of the structure of speech is essential to many speech processing applications, including speech enhancement, synthesis, recognition, and coding. This paper explores some deficiencies of standard harmonic methods of modeling voiced speech. In particular, these methods ignore two things: that the fundamental frequency can change within an analysis frame, and that the fundamental frequency is not a continuously varying parameter but a side effect of a series of discrete events.
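The first deficiency can be illustrated with a small numerical sketch. It is not the authors' method: it synthesizes a toy voiced frame, fits a harmonic model with a single fixed fundamental frequency by least squares, and compares the residual for a steady pitch against a pitch that glides within the frame. The sampling rate, frame length, number of harmonics, 1/k harmonic amplitudes, f0 values, and the function names `voiced_frame` and `fixed_f0_residual` are all assumptions chosen for illustration.

```python
import numpy as np

FS = 16000       # sampling rate in Hz (assumed)
N = 512          # frame length in samples, 32 ms at 16 kHz (assumed)
N_HARM = 10      # number of harmonics in the toy signal and the model (assumed)

def voiced_frame(f0_start, f0_end):
    """Synthesize a harmonic frame whose f0 moves linearly across the frame."""
    f0 = np.linspace(f0_start, f0_end, N)
    # Integrate the instantaneous frequency to get the fundamental's phase.
    phase = 2.0 * np.pi * np.cumsum(f0) / FS
    k = np.arange(1, N_HARM + 1)
    # Harmonic k carries k times the fundamental phase, with amplitude 1/k.
    return (np.cos(np.outer(phase, k)) / k).sum(axis=1)

def fixed_f0_residual(x, f0):
    """Relative residual energy after a least-squares fit of a harmonic
    model whose fundamental frequency is held constant over the frame."""
    t = np.arange(len(x)) / FS
    k = np.arange(1, N_HARM + 1)
    arg = 2.0 * np.pi * f0 * np.outer(t, k)
    # Cosine and sine columns let each harmonic take any amplitude and phase.
    basis = np.hstack([np.cos(arg), np.sin(arg)])
    coef, *_ = np.linalg.lstsq(basis, x, rcond=None)
    resid = x - basis @ coef
    return float(np.sum(resid ** 2) / np.sum(x ** 2))

# Steady pitch: the fixed-f0 model matches the signal.
r_const = fixed_f0_residual(voiced_frame(100.0, 100.0), 100.0)
# Pitch gliding from 100 Hz to 120 Hz, modeled at the mean f0 of 110 Hz.
r_glide = fixed_f0_residual(voiced_frame(100.0, 120.0), 110.0)

print(f"steady f0 residual:  {r_const:.2e}")
print(f"gliding f0 residual: {r_glide:.2e}")
```

Under these assumptions the steady frame is fit essentially exactly, while the gliding frame leaves a clearly nonzero residual, and the mismatch grows with harmonic number because the k-th harmonic sweeps k times as far in frequency as the fundamental.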

We present an alternative, time-series-based framework for modeling the voicing structure of speech, called the fine pitch model. By modeling the voicing structure precisely, the fine pitch model can account more accurately for the content of a voiced speech segment.