In this paper we propose an efficient method for exploiting context in the output densities of HMMs. State scores of a phone recognizer are integrated into the HMMs of a word recognizer, which makes the output densities of the word recognizer context-dependent. Evaluated on a set of spontaneous speech utterances, the approach achieves a significant reduction of the word error rate. Because context can be expected to matter more for some phone models than for others, we further extend the approach with state-dependent weighting factors that control the influence of the different information sources. This extension yields a small additional improvement.
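
The abstract does not state how the two information sources are combined. As an illustration only, one common choice for such a combination is a log-linear interpolation with a per-state weight; the sketch below assumes that form. The function name `combined_log_score`, the state labels, and the weight values are hypothetical and are not taken from the paper.

```python
import math


def combined_log_score(acoustic_log_density, phone_log_score, weight):
    """Interpolate two log scores for a single HMM state.

    Assumed form (not confirmed by the paper):
        (1 - weight) * acoustic_log_density + weight * phone_log_score

    weight = 0 ignores the phone recognizer's state score;
    weight = 1 uses only the phone recognizer's state score.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * acoustic_log_density + weight * phone_log_score


# Hypothetical state-dependent weights: a state whose model is assumed to
# benefit from context gets a larger weight than one that does not.
state_weights = {"state_a": 0.5, "state_b": 0.0}

# Example scores for one frame (made-up values, in the log domain).
acoustic = math.log(0.2)
context = math.log(0.5)

for state, weight in state_weights.items():
    print(state, combined_log_score(acoustic, context, weight))
```

With a weight of zero the state falls back to its original output density, so under this assumed form the unweighted word recognizer is a special case of the weighted one.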