In this paper we examine how long-range language models can be constructed effectively using log-linear interpolation. Particular attention is paid to the efficient computation of the normalisation term in these models. Using the Penn Treebank for experiments, we argue that the perplexity results recently reported in the literature for grammar-based approaches can in fact be achieved with an appropriately smoothed 4-gram language model. Taking such a model as the baseline, we demonstrate how further improvements can be obtained by using log-linear interpolation to combine distance word and class models. We also examine the performance of similar model combinations for rescoring word lattices on a medium-sized vocabulary Wall Street Journal task.
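
For readers unfamiliar with the technique, the following is a minimal sketch of log-linear interpolation for a single fixed history. It assumes the commonly used form in which the combined probability is proportional to the product of the component probabilities, each raised to its own weight, and is then renormalised over the vocabulary. The function name, the toy distributions and the weights are hypothetical and chosen only for illustration; this naive version sums over the whole vocabulary and is not the efficient normalisation studied in the paper.

```python
import math


def log_linear_interpolate(component_probs, weights):
    """Combine component distributions P_i(w | h) for one fixed history h.

    component_probs: list of dicts mapping word -> probability (hypothetical toy input).
    weights: one interpolation weight per component.
    Assumes every probability is strictly positive (i.e. the models are smoothed).
    """
    vocab = component_probs[0].keys()
    # Unnormalised log score: sum_i lambda_i * log P_i(w | h).
    log_scores = {
        w: sum(lam * math.log(p[w]) for lam, p in zip(weights, component_probs))
        for w in vocab
    }
    # Normalisation term: a sum over the whole vocabulary for this history.
    z = sum(math.exp(s) for s in log_scores.values())
    return {w: math.exp(s) / z for w, s in log_scores.items()}


# Toy example: a word model and a class-based model over a three-word vocabulary.
word_model = {"the": 0.5, "cat": 0.3, "sat": 0.2}
class_model = {"the": 0.4, "cat": 0.4, "sat": 0.2}
combined = log_linear_interpolate([word_model, class_model], [0.7, 0.3])
```

Because the normalisation term depends on the history, this naive version costs one pass over the vocabulary per history, which is why computing it efficiently matters in practice.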