ISCA Archive AVSP 2010

d-o-e-s-not-c-o-m-p-u-t-e: vowel hyperarticulation in speech to an auditory-visual avatar

Denis Burnham, Sebastian Joeffry, Lauren Rice

Humans use speech to convey information, attract attention, express affect, and so on. Speech register research shows that humans are adept at fine-tuning components of their speech to accommodate the needs of their audience, suggesting that they have a model of others’ communication needs. However, when that audience is a computer rather than another human, such a model may be invalid, and the resulting speech adaptations, termed Computer-Directed Speech, may be inappropriate. Here we examine humans’ speech to other humans or to an auditory-visual avatar before and after the computer makes a listening “error”. Vowel durations are found to be longer in Computer- than Human-Directed Speech (especially in speech repairs after computer errors), and there is greater vowel hyperarticulation in Computer- than Human-Directed Speech both before and after error correction. The results are discussed in terms of human-computer interaction (HCI), talking head applications and ASR systems.

Index Terms: computer-directed speech, speech repairs, vowel hyperarticulation, human-computer interaction.