In this paper, we present and compare several confidence measures for large vocabulary continuous speech recognition. We show that posterior word probabilities computed on word graphs and N-best lists clearly outperform non-probabilistic confidence measures such as acoustic stability and hypothesis density. In addition, we show that estimating posterior word probabilities on word graphs yields better results than estimating them on N-best lists, and we discuss both methods in detail. We present experimental results on three corpora: the English NAB 94 20k development corpus, the German VERBMOBIL 96 evaluation corpus, and a Dutch corpus recorded with a train timetable information system in the ARISE project.
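As a rough illustration of the N-best variant, the following minimal sketch normalises hypothesis scores over the list and sums the resulting hypothesis posteriors for every hypothesis that contains a given word. It rests on simplifying assumptions that are not taken from the paper: words are matched by their position index rather than by time alignment, and the function name `nbest_word_posteriors`, the `scale` parameter, and the toy scores are hypothetical.

```python
import math
from collections import defaultdict


def nbest_word_posteriors(nbest, scale=1.0):
    """Estimate word posteriors from an N-best list.

    nbest: list of (words, log_score) pairs, one per hypothesis.
    scale: hypothetical scaling factor applied to the log scores.
    Returns a dict mapping (position, word) -> posterior probability.
    Assumption: words are matched by position index, not by time alignment.
    """
    scaled = [scale * log_score for _, log_score in nbest]
    # Log-sum-exp normalisation so the hypothesis posteriors sum to one.
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))

    posteriors = defaultdict(float)
    for (words, _), s in zip(nbest, scaled):
        hyp_posterior = math.exp(s - log_norm)
        # Each word collects the posterior mass of the hypotheses containing it.
        for position, word in enumerate(words):
            posteriors[(position, word)] += hyp_posterior
    return dict(posteriors)


# Toy N-best list with made-up scores.
nbest = [
    (["the", "cat", "sat"], math.log(0.5)),
    (["the", "cat", "sad"], math.log(0.3)),
    (["a", "cat", "sat"], math.log(0.2)),
]
confidence = nbest_word_posteriors(nbest)
```

In this toy list, a word shared by all hypotheses ("cat") receives the full posterior mass, while words that compete at the same position split it. Thresholding such values is one way to tag words as correct or incorrect.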