ISCA Archive AVIOS 2012

Calling um... john or calling john! - the perceptual effect of prosody in voice-activated system responses

Eran Aharonson, Vered Aharonson, Talya Porat, Vered Silber-Varod

In this paper we try to improve the human-machine interaction of a voice-activated system by adding prosodic characteristics to the system's speech responses. We focus on verbal hesitation, which is manifested by speech disfluencies. Recent research on human-human communication shows that moderate disfluencies make speakers more credible. In addition, people tend to react more leniently to an erroneous answer if the conversant gave it in a hesitating manner, implying that he/she was unsure of the correct answer. In this study we investigate the hypothesis that users will react in a similar way to voice-activated systems. Specifically, we hypothesized that adding prosodic features to the system's speech responses will increase the user's perception of the system's credibility, increase his/her overall satisfaction, and reduce frustration while using the system.

Index Terms. multimodal interaction; human-machine interaction; prosody; speech recognition