ISCA Archive Eurospeech 1997

Blame assignment for errors made by large vocabulary speech recognizers

Lin Chase

This paper describes an approach to identifying the reasons that speech recognition errors occur. The algorithm presented requires an accurate word transcript of the utterances being analyzed. It places each error into one of seven categories: 1) an out-of-vocabulary (OOV) word was spoken, 2) search error, 3) homophone substitution, 4) language model overwhelming correct acoustics, 5) transcript/pronunciation problems, 6) confused acoustic models, or 7) miscellaneous/not possible to categorize. Some categories of error can supply training data to automatic corrective training methods that refine acoustic models. Others supply language model and lexicon designers with examples that identify potential improvements. The algorithm is described, and results are presented for the Sphinx-II recognizer [4] on the combined 1992-1995 evaluation test sets of the North American Business (NAB) corpus [1] [2] [3].
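The seven categories can be read as a decision cascade over each error region. The Python sketch below is only an illustration of that reading, not the algorithm presented in the paper: the abstract does not give the decision rules, so the ordering of the tests, the score comparisons, and every name (`Blame`, `ErrorRegion`, `assign_blame`, `same_phones`, `transcript_suspect`) are assumptions. It assumes that acoustic and language model log scores are available for both the reference transcript and the recognizer's hypothesis over the same region, with higher scores being better.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Set


class Blame(Enum):
    """The seven error categories listed in the abstract."""
    OOV = 1
    SEARCH_ERROR = 2
    HOMOPHONE = 3
    LANGUAGE_MODEL = 4
    TRANSCRIPT_OR_PRONUNCIATION = 5
    ACOUSTIC_MODEL = 6
    MISCELLANEOUS = 7


@dataclass
class ErrorRegion:
    """Hypothetical record for one misrecognized region (all fields assumed)."""
    ref_words: List[str]      # words from the accurate transcript
    hyp_words: List[str]      # words the recognizer produced
    ref_acoustic: float       # log scores, higher is better
    hyp_acoustic: float
    ref_lm: float
    hyp_lm: float
    same_phones: bool = False         # reference and hypothesis share a phone string
    transcript_suspect: bool = False  # reference could not be aligned plausibly


def assign_blame(region: ErrorRegion, vocabulary: Set[str]) -> Blame:
    """Assign one category to an error region (assumed ordering of tests)."""
    # A reference word outside the lexicon can never be recognized.
    if any(word not in vocabulary for word in region.ref_words):
        return Blame.OOV
    if region.transcript_suspect:
        return Blame.TRANSCRIPT_OR_PRONUNCIATION
    ref_total = region.ref_acoustic + region.ref_lm
    hyp_total = region.hyp_acoustic + region.hyp_lm
    # The reference outscores the hypothesis, yet the search did not return it.
    if ref_total > hyp_total:
        return Blame.SEARCH_ERROR
    if region.same_phones:
        return Blame.HOMOPHONE
    # Acoustics favored the reference, so the language model tipped the decision.
    if region.ref_acoustic > region.hyp_acoustic:
        return Blame.LANGUAGE_MODEL
    # Acoustics favored the wrong words.
    if region.hyp_acoustic > region.ref_acoustic:
        return Blame.ACOUSTIC_MODEL
    return Blame.MISCELLANEOUS
```

Under these assumptions, a region whose reference has the better acoustic score but loses on the combined score is blamed on the language model, for example `assign_blame(ErrorRegion(["their"], ["the"], -10.0, -12.0, -8.0, -2.0), {"their", "the"})`.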