In this paper we aim to improve human-machine interaction with a voice-activated system by adding prosodic characteristics to the system’s speech. We focus on verbal hesitation, which is manifested through speech disfluencies. Recent research on human-human communication shows that moderate disfluencies make speakers more credible. In addition, people tend to react more leniently to an erroneous answer if it is given in a hesitant manner, which implies that the respondent is unsure of the correct answer. In this study we investigate the hypothesis that users will react in a similar way to voice-activated systems. Specifically, we hypothesize that adding prosodic features to the system’s speech responses will increase the user’s perception of the system’s credibility, increase the user’s overall satisfaction, and reduce frustration while using the system.
Index Terms: multimodal interaction; human-machine interaction; prosody; speech recognition