ISCA Archive SLaTE 2007

Using visual speech for training Chinese pronunciation: an in-vivo experiment

Ying Liu, Dominic W. Massaro, Trevor H. Chen, Derek Chan, Charles Perfetti

Recent research has shown that our perception and understanding of speech are influenced by a speaker's facial expressions and accompanying gestures, as well as by the actual sound of the speech. Perceivers expertly use these multiple sources of information to identify and interpret the language input. Baldi® is a three-dimensional animated talking head appropriately aligned with either synthesized or natural speech. The present in-vivo experiment used Bao, a Chinese version of Baldi, to teach Chinese syllables to adult native speakers of English. The results showed that students trained with Baldi improved more than students trained with ordinary speech. Advantages of the Baldi pedagogy and technology include the popularity and proven effectiveness of computers and embodied conversational agents, the perpetual availability of the program, and individualized instruction. The technological edge of Baldi holds great promise in language learning, dialog, human-machine interaction, education, and edutainment.