ISCA Archive AVSP 1998

Visual Speech Synthesis With Concatenative Speech

Asa Hallgren, Bertil Lyberg

Today, synthetic speech is often based on the concatenation of natural speech: units such as diphones or polyphones are taken from natural speech and then put together to form any word or sentence. So far there have been two main ways of adding a visual modality to such synthesis: morphing between single images or concatenating video sequences. In this study, however, a new method is presented in which recorded natural movements of points on the face are used to control an animated face.

Cite as: Hallgren, A., Lyberg, B. (1998) Visual Speech Synthesis With Concatenative Speech. Proc. Auditory-Visual Speech Processing, 181-184

  author={Asa Hallgren and Bertil Lyberg},
  title={{Visual Speech Synthesis With Concatenative Speech}},
  booktitle={Proc. Auditory-Visual Speech Processing},