ISCA Archive Eurospeech 1999

Audio-visual synthesis of talking faces from speech production correlates

Takaaki Kuratate, Kevin G. Munhall, Philip E. Rubin, Eric Vatikiotis-Bateson, Hani Yehia

This paper presents technical refinements and extensions of our system for correlating the audible and visible components of speech behavior and then using those correlates to generate realistic talking faces. The introduction of nonlinear estimation techniques has improved our ability to generate facial motion either from the speech acoustics or from orofacial muscle EMG. Preliminary evidence is also given for a strong correlation between 3D head motion and fundamental frequency (F0). Coupled with improved methods for deriving facial deformation parameters from static 3D face scans, more realistic talking faces are now being synthesized.
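As a rough illustration of the kind of nonlinear audio-to-face mapping the abstract describes (not the authors' actual implementation, which the abstract does not specify), the sketch below fits a small multilayer perceptron from per-frame acoustic features to facial marker trajectories and scores it with per-channel correlation, a typical accuracy measure in this literature. All data and names here are synthetic and illustrative; the same np.corrcoef call could likewise quantify the reported F0/head-motion relationship.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Hypothetical stand-ins for real measurements (names illustrative):
# acoustic_feats: per-frame spectral features (e.g., line spectral pairs)
# face_markers:  per-frame 3D facial marker positions, flattened per frame
n_frames, n_acoustic, n_markers = 2000, 10, 18 * 3
acoustic_feats = rng.standard_normal((n_frames, n_acoustic))
face_markers = (np.tanh(acoustic_feats @ rng.standard_normal((n_acoustic, n_markers)))
                + 0.1 * rng.standard_normal((n_frames, n_markers)))

# Nonlinear estimator mapping acoustics -> facial motion. An MLP is one
# common choice of nonlinear estimator; the abstract does not name one.
net = MLPRegressor(hidden_layer_sizes=(30,), max_iter=2000, random_state=0)
net.fit(acoustic_feats[:1500], face_markers[:1500])
predicted = net.predict(acoustic_feats[1500:])

# Per-channel Pearson correlation between predicted and measured motion.
r = [np.corrcoef(predicted[:, i], face_markers[1500:, i])[0, 1]
     for i in range(n_markers)]
print(f"mean correlation over marker channels: {np.mean(r):.2f}")
```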