Linear transformations are proposed for transforming vectors of Language Model (LM) probabilities. A separate vector is considered for each word, and the j-th element of a word's vector is the probability of observing that word in the context of its j-th history. If a good general LM is available, the vectors can be clustered into classes and a transformation inferred for each class. The probability distributions of words that are unobserved, or observed only with low frequency, in the adaptation corpus can then be obtained by applying the transformation of the cluster they belong to to the distribution they have in the general model. Experimental results show that there is an interesting range of adaptation corpus sizes in which the perplexity of the adapted LM is lower than the perplexity of an LM whose probabilities are estimated directly from the adaptation data.
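
The idea can be illustrated with a minimal sketch, which is not the authors' implementation. It makes several assumptions that are not in the text: all words share the same indexed set of histories, the cluster assignment is already given, each transformation is restricted to a diagonal matrix fitted by least squares (the transformations described above may well be full matrices), and the adapted probabilities are renormalized per history. All names (`fit_diagonal_transforms`, `adapt`, the toy vocabulary and probabilities) are hypothetical.

```python
def fit_diagonal_transforms(general, adapted, cluster_of, n_hist):
    """Fit one diagonal transform per cluster by least squares.

    For each cluster c and history j, find t[c][j] minimizing
    sum over observed words w in c of (adapted[w][j] - t * general[w][j])**2.
    ASSUMPTION: a diagonal matrix is used only to keep the sketch short.
    """
    num, den = {}, {}
    for word, target in adapted.items():
        c = cluster_of[word]
        num.setdefault(c, [0.0] * n_hist)
        den.setdefault(c, [0.0] * n_hist)
        for j in range(n_hist):
            g = general[word][j]
            num[c][j] += g * target[j]
            den[c][j] += g * g
    # Fall back to the identity scaling (1.0) when a history has no evidence.
    return {
        c: [num[c][j] / den[c][j] if den[c][j] > 0 else 1.0 for j in range(n_hist)]
        for c in num
    }


def adapt(general, adapted, cluster_of, transforms, n_hist):
    """Build adapted vectors for every word in the general model."""
    out = {}
    for word, vec in general.items():
        if word in adapted:
            # Word seen in the adaptation corpus: keep its direct estimate.
            out[word] = list(adapted[word])
        else:
            # Unseen word: transform its general-model vector with the
            # transform of its cluster (or keep it if the cluster has none).
            t = transforms.get(cluster_of[word])
            out[word] = [vec[j] * t[j] for j in range(n_hist)] if t else list(vec)
    # ASSUMPTION: renormalize so that, for each history, the probabilities
    # over all words sum to one.
    for j in range(n_hist):
        total = sum(v[j] for v in out.values())
        if total > 0:
            for v in out.values():
                v[j] /= total
    return out


# Toy data: 4 words, 2 histories; each column sums to one in the general LM.
general_lm = {
    "cat": [0.4, 0.1],
    "dog": [0.3, 0.2],
    "fox": [0.2, 0.3],
    "owl": [0.1, 0.4],
}
# Hypothetical clustering of the word vectors (given, not learned here).
cluster_of = {"cat": 0, "fox": 0, "dog": 1, "owl": 1}
# Only "cat" and "dog" were observed often enough in the adaptation corpus.
adapted_estimates = {"cat": [0.2, 0.2], "dog": [0.6, 0.1]}

transforms = fit_diagonal_transforms(general_lm, adapted_estimates, cluster_of, 2)
adapted_lm = adapt(general_lm, adapted_estimates, cluster_of, transforms, 2)

for word, vec in adapted_lm.items():
    print(word, [round(p, 3) for p in vec])
```

In this toy setting, "fox" and "owl" never appear in the adaptation data, so their adapted distributions come entirely from the transformation fitted on the observed member of their cluster.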