ISCA Archive SpeechProsody 2006

Automatic pitch stylization enhanced by top-down processing

Mikolaj Wypych

This paper presents an original method of pitch stylization that takes the speech waveform and its orthographic transcript as input. In addition to bottom-up data processing, a top-down step is employed, which reduces the contextual variability of the constituents of intonational structure. A software implementation of the stylization method for Polish is described. The design reuses components from an existing automatic intonation recognizer. Fundamental frequency is extracted using a comb filter. In the subsequent stage, syllable-wise pitch stylization is performed, followed by contextual pitch tracking. Intonational structure is recognized by an intonational parser based on Hidden Markov Models. The intonation model, which also provides the annotation system, is taken from the recent intonation grammar for Polish by Jassem. The components were developed in parallel, which allowed trade-offs between the modules to be coordinated. The training set and example results are presented, together with a discussion of future improvements.
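As an illustration of the comb-filter idea mentioned in the abstract, the following minimal sketch estimates the fundamental frequency of a single frame. It is not the paper's implementation, whose details are not given here. It assumes a simple feedforward comb filter, y[n] = x[n + lag] - x[n], and picks the delay that cancels the most signal energy. The function name `comb_filter_f0`, the search range parameters `f0_min` and `f0_max`, and the synthetic test frame are all hypothetical choices made for this example.

```python
import math


def comb_filter_f0(frame, sample_rate, f0_min=120.0, f0_max=400.0):
    """Estimate F0 (Hz) of one frame with a feedforward comb filter.

    For each candidate delay (lag), the comb filter output
    y[n] = x[n + lag] - x[n] is computed. When the lag equals the pitch
    period, all harmonics cancel and the output energy is minimal.
    Returns None for a silent or too-short frame.
    """
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = int(sample_rate / f0_min)
    best_lag, best_ratio = None, float("inf")
    for lag in range(lag_min, lag_max + 1):
        n = len(frame) - lag
        if n <= 0:
            break
        # Energy left after comb filtering with this delay.
        residual = sum((frame[i + lag] - frame[i]) ** 2 for i in range(n))
        # Energy of the two overlapping segments, used for normalization
        # so that lags with fewer overlapping samples are not favored.
        energy = sum(frame[i] ** 2 + frame[i + lag] ** 2 for i in range(n))
        if energy == 0.0:
            continue
        ratio = residual / energy
        if ratio < best_ratio:
            best_ratio, best_lag = ratio, lag
    if best_lag is None:
        return None
    return sample_rate / best_lag


# Hypothetical test frame: 40 ms of a 200 Hz tone with two harmonics,
# sampled at 8 kHz.
SAMPLE_RATE = 8000
frame = [
    math.sin(2 * math.pi * 200 * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 400 * i / SAMPLE_RATE)
    + 0.25 * math.sin(2 * math.pi * 600 * i / SAMPLE_RATE)
    for i in range(320)
]
estimate = comb_filter_f0(frame, SAMPLE_RATE)
```

On this synthetic frame the estimate should fall at about 200 Hz. The sketch is prone to octave errors when a multiple of the true period lies inside the search range, which is why the default `f0_min` is kept fairly high here. A practical extractor would add voicing decisions and smoothing across frames, which is one role the contextual pitch tracking stage could plausibly play.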