In this paper we present an alternative to hidden Markov models for the recognition of image sequences. The approach is based on a stochastic version of recurrent neural networks, which we call diffusion networks. Contrary to hidden Markov models, diffusion networks operate with continuous state dynamics, and generate continuous paths. This aspect that may be beneficial in computer vision tasks in which continuity is a useful constraint. In this paper we review results required for the implementation of diffusion networks, and then apply them to a visual speech recognition task. Diffusion networks outperformed the results obtained with the best hidden Markov models.