ISCA Archive SpeechProsody 2008

Extracting voice quality contours using discrete hidden Markov models

Marko Lugger, Frank Stimm, Bin Yang

In this paper we present an approach to extracting voice quality contours from speech utterances. We apply the theory of hidden Markov models to voice quality classification. As in automatic speech recognition, where the states of the model are interpreted as different phonemes, we interpret the states of our voice quality models as different phonation types. Since nonmodal voice quality is used only selectively in natural speech, the task is to detect those regions within an utterance where the speaker produced these voice qualities. We achieve this by constructing so-called voice quality contours: each segment of speech is associated with one discrete voice quality class as defined by J. Laver. In this study we distinguish between modal, breathy, creaky, and rough voice.
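
The idea of reading a contour off a discrete hidden Markov model can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the decoding algorithm, the model topology, or the features. The sketch assumes a four-state model with one state per phonation type, observations that are already quantized to a small set of discrete symbols, Viterbi decoding, and hand-picked toy probabilities. The names `viterbi`, `voice_quality_contour`, `START_P`, `TRANS_P`, and `EMIT_P` are hypothetical.

```python
import math

# One hidden state per phonation type named in the abstract.
STATES = ["modal", "breathy", "creaky", "rough"]

# Toy parameters (assumptions for illustration, not values from the paper).
START_P = [0.25, 0.25, 0.25, 0.25]
# "Sticky" transitions: a state tends to persist across segments.
TRANS_P = [[0.91 if i == j else 0.03 for j in range(4)] for i in range(4)]
# Discrete emissions over 4 codebook symbols; symbol k is typical of state k.
EMIT_P = [[0.85 if i == k else 0.05 for k in range(4)] for i in range(4)]


def _log(x):
    """Logarithm that maps zero probability to minus infinity."""
    return math.log(x) if x > 0 else float("-inf")


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely state index sequence for discrete symbols."""
    n = len(start_p)
    # Log-domain scores of the best path ending in each state.
    delta = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n)]
    backpointers = []
    for symbol in obs[1:]:
        prev = delta
        delta, ptr = [], []
        for s in range(n):
            # Best predecessor state for arriving in state s.
            best = max(range(n), key=lambda r: prev[r] + _log(trans_p[r][s]))
            ptr.append(best)
            delta.append(prev[best] + _log(trans_p[best][s])
                         + _log(emit_p[s][symbol]))
        backpointers.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def voice_quality_contour(obs):
    """Map a sequence of discrete symbols to one class label per segment."""
    return [STATES[s] for s in viterbi(obs, START_P, TRANS_P, EMIT_P)]


if __name__ == "__main__":
    # Five segments with symbol 0 followed by five segments with symbol 2.
    print(voice_quality_contour([0] * 5 + [2] * 5))
```

Under these toy parameters the decoded sequence follows long runs of a symbol, while the sticky transitions suppress an isolated deviating symbol, which is one way a contour can mark only sustained nonmodal regions.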