ISCA Archive Interspeech 2012

A comparative study of adaptive, automatic recognition of disordered speech

Heidi Christensen, Stuart Cunningham, Charles Fox, Phil Green, Thomas Hain

Speech-driven assistive technology can be an attractive alternative to conventional interfaces for people with physical disabilities. However, a lack of motor control of the speech articulators often results in disordered speech, a condition known as dysarthria. Dysarthric speakers generally cannot obtain satisfactory performance with off-the-shelf automatic speech recognition (ASR) products, and ASR for disordered speech is an increasingly active research area. Sparseness of suitable data is a major challenge. The experiments described here use UAspeech, one of the largest dysarthric speech databases available, which is still easily an order of magnitude smaller than typical speech databases. This study investigates how far state-of-the-art training and adaptation techniques developed in the large-vocabulary continuous speech recognition (LVCSR) community can take us. A variety of ASR systems using maximum likelihood and MAP adaptation strategies are established, with all speakers obtaining significant improvements over the baseline system regardless of the severity of their condition. On average, the best systems show a 34% relative improvement over previously published results. An analysis of the correlation between speaker intelligibility and the type of system that represents the optimal operating point in terms of performance shows that less severely dysarthric speakers have a wider choice of "best" system.

Index Terms: dysarthric speech, speech recognition, speaker adaptation
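
The MAP adaptation strategy named in the abstract can be illustrated with the standard textbook update for a single Gaussian mean, in which the speaker-independent mean acts as a prior and is interpolated with the occupancy-weighted average of the target speaker's frames. The sketch below is illustrative only: the abstract does not say which parameters are adapted or what prior weight is used, so the function name `map_adapt_mean`, its arguments, and the default `tau` are all assumptions rather than the authors' configuration.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Textbook MAP update of one Gaussian mean vector (illustrative sketch).

    prior_mean -- speaker-independent mean, a list of floats
    frames     -- adaptation feature vectors from the target speaker
    posteriors -- occupation probability of this Gaussian for each frame
    tau        -- prior weight; the default is a hypothetical value
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)  # total soft count assigned to this Gaussian

    # Occupancy-weighted sum of the adaptation frames, per dimension.
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the adaptation data.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]
```

With little adaptation data the occupancy is small and the result stays close to the prior mean; as more data from the speaker accumulates, the estimate moves toward the speaker's own statistics. This behaviour is one reason MAP-style adaptation is a natural candidate when data are sparse.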