This paper discusses tree- and trellis-based models as alternatives to conventional n-gram models as the basis for language modelling in automatic speech recognition. The advantage of these models lies in their compactness and in the manner in which extended context is used to enhance performance. The latter is confirmed experimentally using models trained and tested on subsets of the Lancaster-Oslo/Bergen corpus.