ISCA Archive ICSLP 2002

Experiments on speaker-independent voice command recognition using in-vehicle hands free speech

Yifan Gong, Lorin Netsch

For speaker-independent voice command recognition, acoustic models can be trained on either task-independent or task-specific speech corpora. The first alternative reuses available speech databases and avoids collecting task-specific speech data. However, it is expected to give lower recognition performance because of acoustic and vocabulary mismatches between the speech corpora and the task.

This paper addresses the performance difference that can be expected between the two alternatives for a hands-free command recognition task. The databases involved are a general American speech database recorded in an office and a command database recorded hands-free in a vehicle with a distant-talking microphone. The evaluation varies three factors: the speech model training technique (task-independent vs. task-specific), the mismatch compensation technique, and the driving condition. The experiments show that adapting the models with task-specific speech data reduces the word error rate of models trained on task-independent speech data by more than two thirds under all driving conditions.
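
The "more than two thirds" figure is a relative reduction in word error rate (WER). A minimal sketch of how such a figure is computed follows. The function name and the two WER values are hypothetical, chosen only for illustration, and are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the fraction of the baseline WER removed by adaptation."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical values, NOT taken from the paper:
baseline = 6.0  # WER (%) of models trained on task-independent speech
adapted = 1.5   # WER (%) after adaptation with task-specific speech

reduction = relative_wer_reduction(baseline, adapted)
print(f"relative WER reduction: {reduction:.2f}")
print("exceeds two thirds:", reduction > 2 / 3)
```

With these illustrative numbers the adapted models remove three quarters of the baseline errors, which would satisfy the "more than two thirds" criterion. A claim that holds for all driving conditions means the same comparison passes for each condition separately.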