Automatic speech recognition systems based on subword units (such as phonemes) can be enhanced by the use of context-specific modelling. This has been applied successfully in top-down recognition systems, in which strong lexical and syntactic constraints limit the number of context-specific units to be modelled. This paper describes a method for applying context-specific modelling in a modular system in which the acoustic-phonetic front end operates independently of vocabulary and syntax. Such a modular system has certain advantages as a research tool, particularly when combined with an entropy measure for evaluation of phoneme lattices. A technique for robust modelling of context-specific units, by interpolation of general and specific probability estimates, is also described. Comparative results are presented which show the improvements due to the context-specific modelling.
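The abstract does not give the form of the interpolation, so the following is only a minimal sketch of one common way to combine a general (context-independent) estimate with a context-specific one. It assumes a linear mixture whose weight grows with the number of training examples seen for the specific context. The function names, the weighting rule and the constant `k` are illustrative assumptions, not taken from the paper.

```python
from typing import List


def interpolation_weight(n_specific: int, k: float = 10.0) -> float:
    """Weight given to the context-specific estimate.

    Assumed rule (not from the paper): the weight rises from 0 towards 1
    as the number of training examples for the specific context grows.
    """
    return n_specific / (n_specific + k)


def interpolate(p_specific: List[float],
                p_general: List[float],
                n_specific: int,
                k: float = 10.0) -> List[float]:
    """Linearly mix two discrete probability distributions.

    With few examples of the specific context the result stays close to
    the general estimate; with many it approaches the specific estimate.
    """
    lam = interpolation_weight(n_specific, k)
    return [lam * s + (1.0 - lam) * g for s, g in zip(p_specific, p_general)]
```

Under these assumptions, a context with no training examples falls back entirely on the general estimate, while a context seen many times is modelled almost entirely by its own estimate. Because both inputs are distributions and the weights sum to one, the result is also a distribution.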