This paper presents the virtual speech cuer built in the context of the ARTUS project, which aims at watermarking hand and face gestures of a virtual animated agent into a broadcast audiovisual sequence. For deaf televiewers who master Cued Speech, the animated agent can then be superimposed, on demand and at the receiver, onto the original broadcast as an alternative to subtitling. The paper presents the multimodal text-to-speech synthesis system and a first evaluation conducted with deaf users.