ISCA Archive Interspeech 2010

Efficient estimation of maximum entropy language models with n-gram features: an SRILM extension

Tanel Alumäe, Mikko Kurimo

We present an extension to the SRILM toolkit for training maximum entropy language models with N-gram features. The extension uses a hierarchical parameter estimation procedure to keep training time and memory consumption feasible for moderately large training data (hundreds of millions of words). Experiments on two speech recognition tasks indicate that models trained with our implementation perform as well as or better than N-gram models built with interpolated Kneser-Ney discounting.
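
For readers unfamiliar with the model class, the following is a minimal sketch of how a conditional maximum entropy language model with N-gram features assigns a probability to a word given its history. It is not the SRILM extension's code and does not show the hierarchical estimation procedure; the function name `maxent_prob`, the weight dictionary layout, and the toy weights are all assumptions made for illustration.

```python
import math

def maxent_prob(word, history, vocab, weights):
    """Return P(word | history) under a toy conditional maxent model.

    `weights` is a hypothetical dict mapping n-gram tuples to feature
    weights; n-grams that are absent contribute a weight of 0.
    """
    def score(w):
        # Sum the weights of all active n-gram features ending in w:
        # the unigram (w,), the bigram (h_last, w), and so on up to
        # the full history followed by w.
        total = 0.0
        for k in range(len(history) + 1):
            context = tuple(history[len(history) - k:])
            total += weights.get(context + (w,), 0.0)
        return total

    # Normalize over the whole vocabulary for this history.
    z = sum(math.exp(score(w)) for w in vocab)
    return math.exp(score(word)) / z

# Toy example with made-up weights (not taken from the paper).
vocab = ["a", "b"]
weights = {("a",): 0.5, ("a", "b"): 1.0}
p_a = maxent_prob("a", ["a"], vocab, weights)
p_b = maxent_prob("b", ["a"], vocab, weights)
```

The normalization term sums over the entire vocabulary for every history, which hints at why a naive training loop becomes expensive on large corpora.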