ISCA Archive Eurospeech 1995

Extensions of absolute discounting for language modeling

M. Generet, Hermann Ney, F. Wessel

In this paper, we extend the absolute discounting technique along various directions. To estimate the backing-off distribution, we use n-gram singletons, i.e. n-grams that were seen exactly once in the training data. This method is applied in addition to the usual estimation of discounting parameters. The improvement in perplexity is typically between 8% and 12%. We also investigate a cache model. In experimental tests on a large text corpus, the cache model improved the perplexity by up to 28%. The experimental evaluations were carried out on a set of 38 million words from the Wall Street Journal task. We compare our results with the results reported by CMU.
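
The abstract gives no formulas, so the following is only a minimal sketch of how absolute discounting with a singleton-based backing-off distribution might look. It assumes a bigram model in interpolated form, a fixed discount between 0 and 1, and a small additive floor so that every vocabulary word keeps non-zero backing-off mass. The function name train_bigram_model and the parameters discount and floor are hypothetical and not taken from the paper; the cache model is not covered.

```python
from collections import Counter


def train_bigram_model(tokens, discount=0.5, floor=1e-3):
    """Build p(w | v) with absolute discounting and a singleton-based backoff.

    Assumptions (not from the abstract): bigram order, interpolated form,
    a fixed discount with 0 < discount < 1, and a small additive floor.
    """
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    history_total = Counter()   # N(v): total count of bigrams starting with v
    history_types = Counter()   # number of distinct words seen after v
    singletons = Counter()      # per word w: bigrams (v, w) seen exactly once
    for (v, w), n in bigram_counts.items():
        history_total[v] += n
        history_types[v] += 1
        if n == 1:
            singletons[w] += 1

    vocab = sorted(set(tokens))
    norm = sum(singletons.values()) + floor * len(vocab)

    def backoff(w):
        # Distribution over the vocabulary, estimated from singleton bigrams.
        return (singletons[w] + floor) / norm

    def prob(v, w):
        total = history_total[v]
        if total == 0:
            return backoff(w)  # unseen history: rely on the backoff alone
        # Subtract the discount from every seen bigram count ...
        discounted = max(bigram_counts[(v, w)] - discount, 0.0) / total
        # ... and redistribute the freed mass via the backoff distribution.
        freed_mass = discount * history_types[v] / total
        return discounted + freed_mass * backoff(w)

    return prob, vocab


tokens = "a b a c a b b a".split()
prob, vocab = train_bigram_model(tokens)
print({w: round(prob("a", w), 4) for w in vocab})
```

For any history, the probabilities over the training vocabulary sum to one, because the mass removed from seen bigrams (discount times the number of distinct successors) is exactly what the backing-off term hands back. Words outside the training vocabulary are not handled by this sketch.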