This paper describes an automatic process that builds variable-length compound words, without fixing their maximum length, according to the contexts in which they are observed in a training text. Four criteria have been studied: the bigram frequency, a normalized measure based on mutual information, and the left and right conditional probabilities. This work has been performed with a database recorded at LIMSI, consisting of rail travel information requests. The corresponding language models have been evaluated in terms of perplexity and speech recognition error rates with the LIMSI speech recognizer, and compared with a baseline word bigram model. The best results are obtained when the model is built with words concatenated according to the left conditional probability.
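
As an illustration only, the general idea of scoring adjacent word pairs and merging them repeatedly can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in this work: the function names (`bigram_scores`, `merge_pair`, `build_compounds`), the `min_count` and `max_merges` parameters, and the underscore separator are hypothetical. Plain pointwise mutual information stands in for the normalized measure, whose exact form is not given here, and the left and right conditional probabilities are assumed to be P(w1 | w2) and P(w2 | w1) for a pair (w1, w2).

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Score every adjacent pair (w1, w2) under four illustrative criteria."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    scores = {}
    for (w1, w2), count in bigrams.items():
        p12 = count / (n - 1)
        p1 = unigrams[w1] / n
        p2 = unigrams[w2] / n
        scores[(w1, w2)] = {
            "frequency": count,
            # Assumption: unnormalized pointwise mutual information.
            "mutual_information": math.log(p12 / (p1 * p2)),
            # Assumption: probability of the left word given the right word.
            "left_conditional": count / unigrams[w2],
            # Assumption: probability of the right word given the left word.
            "right_conditional": count / unigrams[w1],
        }
    return scores


def merge_pair(tokens, pair, sep="_"):
    """Replace each non-overlapping occurrence of `pair` with one compound token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + sep + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def build_compounds(tokens, criterion="left_conditional", min_count=2, max_merges=10):
    """Greedily merge the best-scoring pair until no pair occurs min_count times.

    Because compounds re-enter the text as single tokens, later merges can
    extend them, so no maximum compound length has to be fixed in advance.
    """
    for _ in range(max_merges):
        scores = bigram_scores(tokens)
        candidates = [
            (s[criterion], pair)
            for pair, s in scores.items()
            if s["frequency"] >= min_count
        ]
        if not candidates:
            break
        _, best = max(candidates)
        tokens = merge_pair(tokens, best)
    return tokens
```

For example, calling `build_compounds` on the tokens of "i want a ticket to paris i want a ticket to lyon" with `criterion="frequency"` collapses the repeated request prefix into a single compound token, while the city names, which each occur once, are left as separate words.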