ISCA Archive Interspeech 2012

A stochastic model of singing voice F0 contours for characterizing expressive dynamic components

Yasunori Ohishi, Hirokazu Kameoka, Daichi Mochihashi, Kunio Kashino

We present a novel stochastic model of singing voice fundamental frequency (F0) contours for characterizing expressive dynamic components such as vibrato and portamento. Although dynamic components can be important features for any singing voice application, modeling these components and extracting them from a raw F0 contour have yet to be accomplished. We therefore describe the process that generates the dynamic components explicitly and represent that process as a stochastic model. We then develop an algorithm for estimating the model parameters based on statistical techniques. Experimental results show that our method successfully extracts the expressive components from raw F0 contours.

Index Terms: Singing voice, Fundamental frequency, Second-order linear system, Stochastic model
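
The index terms mention a second-order linear system, so the following minimal sketch illustrates how such a system can turn a stepwise note sequence into a smooth contour with overshoot, the kind of dynamic behaviour the abstract refers to. It is not the authors' model: the function name, the parameter values (omega, zeta, dt), the pitch scale in semitones, and the semi-implicit Euler discretization are all assumptions chosen for illustration, and the stochastic part of the model is omitted.

```python
def second_order_response(target, dt=0.001, omega=40.0, zeta=0.4):
    """Simulate y'' + 2*zeta*omega*y' + omega^2*y = omega^2*u.

    target : list of target pitch values u (here, semitones), one per time step
    dt     : step size in seconds (hypothetical value)
    omega  : natural angular frequency in rad/s (hypothetical value)
    zeta   : damping ratio; zeta < 1 gives an underdamped, overshooting response
    """
    y = target[0]  # start at rest on the first note
    v = 0.0        # initial velocity of the contour
    out = []
    for u in target:
        # acceleration from the second-order differential equation
        a = omega * omega * (u - y) - 2.0 * zeta * omega * v
        # semi-implicit Euler: update velocity first, then position
        v += a * dt
        y += v * dt
        out.append(y)
    return out


# Hypothetical input: a note change from 60 to 62 semitones after 0.5 s.
n = 1000
target = [60.0] * (n // 2) + [62.0] * (n // 2)
contour = second_order_response(target)
overshoot = max(contour) - 62.0  # how far the contour exceeds the new note
```

With a damping ratio below 1 the contour overshoots the new note before settling on it. Raising zeta toward 1 removes the overshoot and leaves a smooth glide between the two notes.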