ISCA Archive Eurospeech 1997

Confidence metrics based on n-gram language model backoff behaviors

C. Uhrik, W. Ward

We report results on language model confidence measures derived from the degree of backoff in a trigram language model. Both utterance-level and word-level confidence metrics proved useful to a dialog manager for identifying out-of-domain utterances. The metric assigns successively lower confidence as the language model estimate is backed off to a bigram or a unigram. It also bases its estimates on the sequence of backoff degrees. Experimental results with utterances from the domain of medical records management showed that the distributions of the confidence metric for in-domain and out-of-domain utterances are separated. The corresponding word-level confidence metric shows similarly encouraging results.
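The idea of scoring by backoff degree can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formula, so the per-level scores in `LEVEL_SCORE`, the two-word averaging window in `word_confidences`, the `<s>` padding symbol, and all function names are assumptions chosen for illustration. The language model is reduced to sets of observed trigrams and bigrams, which is enough to decide how far an estimate would back off.

```python
from typing import Iterable, List, Sequence, Set, Tuple

# Hypothetical scores per backoff level (not taken from the paper):
# 3 = trigram found, 2 = backed off to bigram, 1 = backed off to unigram.
LEVEL_SCORE = {3: 1.0, 2: 0.5, 1: 0.0}
PAD = "<s>"  # assumed sentence-start padding symbol


def collect_ngrams(
    sentences: Iterable[Sequence[str]],
) -> Tuple[Set[Tuple[str, ...]], Set[Tuple[str, ...]]]:
    """Collect the trigrams and bigrams seen in the training sentences."""
    trigrams: Set[Tuple[str, ...]] = set()
    bigrams: Set[Tuple[str, ...]] = set()
    for words in sentences:
        padded = [PAD, PAD] + list(words)
        for i in range(2, len(padded)):
            trigrams.add(tuple(padded[i - 2 : i + 1]))
            bigrams.add(tuple(padded[i - 1 : i + 1]))
    return trigrams, bigrams


def backoff_levels(
    words: Sequence[str],
    trigrams: Set[Tuple[str, ...]],
    bigrams: Set[Tuple[str, ...]],
) -> List[int]:
    """Return, for each word, the highest n-gram order that matched."""
    padded = [PAD, PAD] + list(words)
    levels = []
    for i in range(2, len(padded)):
        if tuple(padded[i - 2 : i + 1]) in trigrams:
            levels.append(3)
        elif tuple(padded[i - 1 : i + 1]) in bigrams:
            levels.append(2)
        else:
            levels.append(1)
    return levels


def word_confidences(levels: Sequence[int]) -> List[float]:
    """Word-level confidence: mean score of the current and previous word.

    The two-word window is an assumed, simple way to let the sequence of
    backoff degrees influence each word's confidence.
    """
    scores = [LEVEL_SCORE[level] for level in levels]
    result = []
    for i in range(len(scores)):
        window = scores[max(0, i - 1) : i + 1]
        result.append(sum(window) / len(window))
    return result


def utterance_confidence(levels: Sequence[int]) -> float:
    """Utterance-level confidence: mean per-word score (0.0 if empty)."""
    if not levels:
        return 0.0
    return sum(LEVEL_SCORE[level] for level in levels) / len(levels)


if __name__ == "__main__":
    tri, bi = collect_ngrams(["show me the patient record".split()])
    for text in ("show me the patient record", "order me the pizza"):
        lv = backoff_levels(text.split(), tri, bi)
        print(text, lv, word_confidences(lv), utterance_confidence(lv))
```

In this toy setting, an utterance matching the training data keeps trigram-level estimates throughout and scores high, while an unrelated utterance backs off repeatedly and scores low. A dialog manager could compare the utterance-level value against a threshold tuned on held-out data; the threshold itself is not specified in the abstract.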