We present the main approaches used to synthesize and drive talking faces, and describe illustrative systems. We distinguish between facial synthesis itself (i.e., the manner in which facial movements are rendered on a computer screen) and the way these movements may be controlled and predicted from phonetic input. We then focus on the need to capture, model, and render with maximum fidelity the intimate coherence of the facial deformations observed on a human face.