A back-propagation network with recurrent connections can successfully model many aspects of human spoken word recognition [1] [2]. However, such a network is unable to revise its decisions in the light of subsequent context. TRACE [3], on the other hand, deals appropriately with following context, but only by using a highly implausible architecture that fails to account for some important experimental results. A new model is presented which combines the more desirable properties of these two models. In contrast to TRACE, the new model is entirely bottom-up and can readily perform simulations with vocabularies of tens of thousands of words.