ISCA Archive Eurospeech 2001

A boosting approach for confidence scoring

Pedro J. Moreno, Beth Logan, Bhiksha Raj

In this paper we present the application of a boosting classification algorithm to confidence scoring. We derive feature vectors from speech recognition lattices and feed them into a boosting classifier. This classifier combines hundreds of very simple "weak learners" and derives classification rules that can reduce the confidence error rate by up to 34%. We compare our results to those obtained using two other standard classification techniques, Support Vector Machines (SVMs) and Classification and Regression Trees (CART), and show significant improvements. Furthermore, the nature of the boosting algorithm allows us to combine it with the best single classifier and improve on that classifier's performance. We present experimental results on real-world corpora derived from our SpeechBot Web index (http://www.speechbot.com) and from the HUB4 DARPA evaluation sets. We believe these results have wide applicability to audio indexing and to acoustic and language model adaptation, where word confidence scores can be used in iterative adaptation schemes.
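As a rough illustration of how a boosting classifier combines weak learners, the following is a minimal sketch of discrete AdaBoost with single-threshold decision stumps. It is not necessarily the variant or the weak learner used in the paper, which the abstract does not specify. The toy data is made up: each word hypothesis is reduced to one hypothetical feature value, and the labels (+1 for a correctly recognized word, -1 for an error) are chosen so that no single stump separates them. All function names are illustrative.

```python
import math

def stump_predict(stump, x):
    """Weak learner: a threshold test on one feature, returning +1 or -1."""
    feature, threshold, polarity = stump
    return polarity if x[feature] >= threshold else -polarity

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best_err, best_stump = None, None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if best_err is None or err < best_err:
                    best_err, best_stump = err, stump
    return best_err, best_stump

def adaboost(X, y, rounds):
    """Discrete AdaBoost: returns a list of (alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, stump = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified examples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def score(ensemble, x):
    """Weighted vote of the weak learners; the sign gives the decision."""
    return sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)

def classify(ensemble, x):
    return 1 if score(ensemble, x) >= 0 else -1

# Made-up toy data: one hypothetical feature per word hypothesis.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]

model = adaboost(X, y, rounds=3)
predictions = [classify(model, x) for x in X]
```

On this toy set a single stump misclassifies at least two of the six examples, while three boosted stumps reproduce all six labels. In a confidence-scoring setting, the value returned by `score` could serve as a raw confidence for the word, to be thresholded or calibrated as needed; that use is an assumption of this sketch rather than a detail stated in the abstract.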