ISCA Archive Interspeech 2007

Continuous-speech phone recognition from ultrasound and optical images of the tongue and lips

Thomas Hueber, Gérard Chollet, Bruce Denby, Gérard Dreyfus, Maureen Stone

The article describes a video-only speech recognition system for a "silent speech interface" application, using ultrasound and optical images of the tongue and lips. A one-hour audiovisual speech corpus was phonetically labeled using an automatic speech alignment procedure and robust visual feature extraction techniques. HMM-based stochastic models were estimated separately on the visual and acoustic data. The performance of the visual speech recognition system is compared with that of a traditional acoustic-based recognizer.