ISCA Archive Interspeech 2007

Prosody change and response timing analysis in spontaneously spoken dialogs and their modeling in a spoken dialog system

Ryota Nishimura, Norihide Kitaoka, Seiichi Nakagawa

Interaction with a dialog system would be smoother if the system could respond to a user as naturally as a human does. Imitating human prosodic behavior is therefore important for natural human-computer conversation. In this paper, with the aim of developing a cooperative and friendly spoken dialog system, we analyzed how F0 synchrony tendency and overlap frequency correlate with three subjective measures, "liveliness," "familiarity," and "informality," in human-human dialogs. We also modeled the properties of these features and implemented the model in our dialog system, which generates, in real time, the response timing of aizuchi (back-channels) and turn-taking based on a decision tree, as well as dynamic F0 changes, to realize chat-like conversations.
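The correlation analysis mentioned above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not define how F0 synchrony is measured, so the `f0_synchrony` function below assumes, purely for illustration, that synchrony is the Pearson correlation between the two speakers' log mean F0 over paired adjacent utterances. The function names and all numeric values are hypothetical.

```python
from math import log, sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def f0_synchrony(f0_a_hz, f0_b_hz):
    """Assumed synchrony score (not from the paper): correlation of the
    two speakers' log mean F0 over paired adjacent utterances."""
    return pearson([log(f) for f in f0_a_hz], [log(f) for f in f0_b_hz])


if __name__ == "__main__":
    # Hypothetical per-dialog synchrony scores and subjective ratings,
    # invented only to show how the two would be correlated.
    synchrony_per_dialog = [0.10, 0.35, 0.50, 0.72]
    liveliness_ratings = [2.0, 3.5, 3.0, 4.5]
    print(pearson(synchrony_per_dialog, liveliness_ratings))
```

The same `pearson` function could be applied to overlap frequency or to the other subjective measures ("familiarity," "informality") under the same assumptions.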