This paper presents an experiment in audio-visual recognition of voice commands from a small vocabulary under simulated noisy conditions. The voice commands are intended to control electrical and electronic devices, mainly in the homes of motor-handicapped people, and the visual part of speech could improve the recognition rate when the noise is relatively strong. The main aims of the experiment were therefore to find out how the visual part of speech can improve the resulting recognition rate, and whether two-stream Hidden Markov Models (HMMs) can be used successfully for audio-visual voice command recognition.