Hands-free speech recognition is essential for a natural human-machine interface. In real environments, however, distant-talking speech is distorted by noise and room reverberation. This paper describes the characteristics of this room acoustic distortion and its influence on speech recognition accuracy. It then outlines prospective solutions based on previous studies and our own research, focusing on a microphone-array-based method and a model adaptation method. The microphone array reduces the influence of acoustic distortion through beamforming, whereas the model adaptation method estimates the acoustic transfer function and adapts the speech models to the distorted observation signals. Finally, the paper addresses hands-free speech recognition that incorporates automatic lip reading.
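
To make the beamforming idea concrete, the following is a minimal sketch of a delay-and-sum beamformer, the simplest textbook form of beamforming and not necessarily the method studied in this paper. It assumes that the per-microphone delays are known integer numbers of samples and that the noise is independent across microphones. All names (`delay_and_sum`, `simulate_array`, `mse`) and all numeric values are hypothetical choices made for illustration.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average."""
    # Keep only the samples that every channel still has after alignment.
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[t + d] for ch, d in zip(channels, delays)) / len(channels)
        for t in range(n)
    ]


def simulate_array(source, delays, noise_std, rng):
    """Toy array: each microphone hears the source delayed by d samples,
    plus independent Gaussian noise (no reverberation is modeled)."""
    channels = []
    for d in delays:
        delayed = [0.0] * d + list(source)
        channels.append([x + rng.gauss(0.0, noise_std) for x in delayed])
    return channels


def mse(a, b):
    """Mean squared error over the common length of two signals."""
    n = min(len(a), len(b))
    return sum((x - y) ** 2 for x, y in zip(a[:n], b[:n])) / n


rng = random.Random(0)  # fixed seed so the run is repeatable
source = [math.sin(2 * math.pi * 0.01 * t) for t in range(2000)]
delays = [0, 3, 6, 9]  # assumed per-microphone delays, in samples

channels = simulate_array(source, delays, noise_std=0.5, rng=rng)
enhanced = delay_and_sum(channels, delays)

single_err = mse(channels[0], source)  # one microphone alone
array_err = mse(enhanced, source)      # after delay-and-sum
print(f"single microphone MSE: {single_err:.4f}")
print(f"delay-and-sum MSE:     {array_err:.4f}")
```

Because the aligned speech components add coherently while the independent noise components partly cancel, the averaged output should be closer to the source than any single channel. Real rooms add reverberation and unknown, fractional delays, which this sketch deliberately leaves out.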