ISCA Archive ICSLP 2002

Acoustic modeling of sentence stress using differential features between syllables for English rhythm learning system development

Nobuaki Minematsu, Satoshi Kobashikawa, Keikichi Hirose, Donna Erickson

This study proposes a new technique for acoustic modeling of stressed and unstressed syllables in sentence utterances of American English. Relative differences in acoustic features between two consecutive syllables, which characterize a syllable as "stressed" or "unstressed", are introduced into HMM-based acoustic modeling, because a syllable can be identified as stressed or unstressed only by comparing it with its neighboring syllables. Because we could not find any database that could be used directly for this modeling, we recorded the speech samples for training the syllable HMMs ourselves. The fourth author assigned multi-level stress marks (syllable magnitude) to the individual syllables of a given sentence set, following guidelines for teaching English rhythm to non-native speakers of English that she proposed and used in class. After the stress marks were assigned, she uttered the sentences, and the recordings were used for the HMM-based modeling. Experiments showed that stress/unstress identification errors were reduced by about 25% compared with modeling without the relative differences. An English sentence stress detector based on this new technique is being developed.
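As an illustration of the idea of differential features between consecutive syllables, the following minimal sketch appends to each syllable's feature vector its difference from the preceding syllable. It is not the authors' implementation: the function name `add_differential_features`, the choice of features (log power, log F0, duration), the use of the preceding syllable only, and the zero difference assigned to the first syllable are all assumptions made for the example, and the numeric values are invented.

```python
from typing import List, Sequence


def add_differential_features(
    syllables: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Append differences from the preceding syllable to each feature vector.

    Assumption: the first syllable has no predecessor, so its differences
    are set to zero. The abstract does not say how boundaries are handled.
    """
    augmented = []
    for i, current in enumerate(syllables):
        if i == 0:
            delta = [0.0] * len(current)
        else:
            previous = syllables[i - 1]
            delta = [c - p for c, p in zip(current, previous)]
        augmented.append(list(current) + delta)
    return augmented


# Hypothetical per-syllable values: [log_power, log_f0, duration_s]
example = [
    [4.0, 5.0, 0.10],
    [6.0, 5.5, 0.25],
    [4.5, 5.0, 0.12],
]
result = add_differential_features(example)
```

In this toy input the second syllable is louder, higher, and longer than its neighbors, so its difference terms are positive and those of the third syllable are negative. This is the kind of contrast between adjacent syllables that the abstract argues is needed to tell stressed from unstressed syllables.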