ISCA Archive ICSLP 2002

Design of an audio-visual speech corpus for the Czech audio-visual speech synthesis

Milos Zelezný, Petr Císar, Zdenek Krnoul, Jan Novák

Our long-term goal is to design a system for Czech visual speech synthesis, that is, an animated synthetic face (often called a talking head) that imitates a human speaker pronouncing speech. In this paper we present the techniques used to acquire data and build the audio-visual speech corpus, especially its visual part. This process involves recording stereoscopic video data and solving related problems such as synchronization. In addition, we present a simple method of using such a corpus, based on stereo vision principles and on modelling the shape of the lips with a simple triangular mesh.
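The abstract does not give the details of the stereo reconstruction or of the lip mesh, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes an idealized, rectified stereo pair (parallel cameras, identical focal length), in which depth follows from horizontal disparity as Z = f·B/d, and it assumes a triangle-fan layout for the lip mesh. All names and parameters (`triangulate_rectified`, `fan_mesh`, `focal_px`, `baseline_mm`, `cx`, `cy`) are hypothetical.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Triangle = Tuple[int, int, int]


def triangulate_rectified(
    xl: float, yl: float, xr: float,
    focal_px: float, baseline_mm: float,
    cx: float, cy: float,
) -> Point3D:
    """Recover a 3D point from one matched lip point in a rectified stereo pair.

    Assumption: both cameras share the focal length (in pixels) and the
    principal point (cx, cy), and the images are already rectified, so the
    match lies on the same image row in the left and right views.
    """
    disparity = xl - xr  # horizontal shift between the two views, in pixels
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_mm / disparity  # depth, in the units of the baseline
    x = (xl - cx) * z / focal_px            # back-project the left-image pixel
    y = (yl - cy) * z / focal_px
    return (x, y, z)


def fan_mesh(n_contour: int) -> List[Triangle]:
    """Build a triangle fan over a closed lip contour.

    Assumption: vertex 0 is a centre point and vertices 1..n_contour are the
    contour points in order; each triangle joins the centre to one contour edge.
    """
    return [(0, i, i % n_contour + 1) for i in range(1, n_contour + 1)]


if __name__ == "__main__":
    # Hypothetical camera values, chosen only to illustrate the arithmetic.
    point = triangulate_rectified(420.0, 240.0, 380.0,
                                  focal_px=800.0, baseline_mm=100.0,
                                  cx=320.0, cy=240.0)
    print(point)
    print(fan_mesh(4))
```

In such a setup, each tracked lip point would be triangulated frame by frame, and the resulting 3D vertices would be connected by a fixed triangle list, so that only the vertex positions change over time.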