The goal of bimodal (audio-video) synthesis of acoustic speech has been addressed using Kohonen neural architectures tasked with associating acoustic input parameters (cepstrum coefficients) with articulatory estimates. The association is performed in real time, so that the source acoustic speech can be presented in synchrony with a coherent articulatory visualization. Different architectural solutions have been investigated and compared in terms of both objective measures (estimation distortion) and subjective evaluation (perception experiments).
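
The network details are not given here, so the following is only a minimal sketch of one common way a Kohonen map can associate acoustic parameters with articulatory estimates. It assumes a one-dimensional map trained on joint vectors (acoustic components followed by articulatory components); at run time the best-matching unit is found from the acoustic part alone and its articulatory part is read out. The function names, map size, learning schedule, and synthetic toy data are all illustrative assumptions, not the architectures evaluated in this work.

```python
import math
import random


def best_unit(weights, x, n_dims):
    """Index of the unit closest to x, comparing only the first n_dims components."""
    return min(
        range(len(weights)),
        key=lambda j: sum((weights[j][k] - x[k]) ** 2 for k in range(n_dims)),
    )


def train_som(samples, n_units=10, epochs=30, seed=0):
    """Train a 1-D Kohonen map on joint [acoustic + articulatory] vectors."""
    rng = random.Random(seed)
    # Initialise each unit from a randomly chosen training sample.
    weights = [list(rng.choice(samples)) for _ in range(n_units)]
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = samples[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = 0.5 * (1.0 - frac) + 0.01                   # decaying learning rate
            sigma = max(n_units / 2.0 * (1.0 - frac), 0.5)   # shrinking neighbourhood
            bmu = best_unit(weights, x, len(x))
            for j, w in enumerate(weights):
                # Gaussian neighbourhood measured along the 1-D map.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def estimate_articulatory(weights, acoustic):
    """Match on the acoustic components only, then read out the articulatory part."""
    n_ac = len(acoustic)
    bmu = best_unit(weights, acoustic, n_ac)
    return weights[bmu][n_ac:]


def make_toy_data(n=50):
    """Synthetic stand-in data (assumption): 2 'cepstral' values and 1 'articulatory' value."""
    data = []
    for i in range(n):
        t = i / (n - 1)
        data.append([t, t * t, 1.0 - t])
    return data


if __name__ == "__main__":
    som = train_som(make_toy_data())
    print(estimate_articulatory(som, [0.5, 0.25]))
```

Because the lookup at run time is a single nearest-unit search over a small codebook, its cost per frame is fixed and low, which is compatible with the real-time operation described above. The mean squared difference between such estimates and reference articulatory values would be one possible form of the estimation distortion used for objective comparison.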