The capability profiles of commercial automatic speech recognition (ASR) systems are rapidly improving in terms of vocabulary size, noise robustness and user population. Most contemporary applications of ASR use interfaces relying solely on the speech mode of interaction (for example, over telephone channels). Many applications will, however, benefit from using speech input in conjunction with other interaction devices such as trackballs, keyboards and touch-screens. In this paper, we present an interface modelling approach based on a critical path analysis of the interface design. The approach has been developed to model multi-modal interactions using combinations of input devices. Degrading the performance of individual units allows the effects of environmental factors on the overall interface performance to be predicted. The model is verified by comparison with experimental trials carried out on a number of multi-modal applications. The model is shown to predict the main performance metric (task completion time) to within 10% of the experimental values.
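As an illustration of the general idea only, a critical path analysis can be sketched as a longest-path computation over a network of unit operations, with each unit's duration scaled by a degradation factor. The sketch below is not the paper's model: the unit names, durations, precedence links and degradation factor are all hypothetical values chosen for the example.

```python
from graphlib import TopologicalSorter


def critical_path_time(durations, predecessors, degradation=None):
    """Return the completion time of the longest (critical) path.

    durations:    unit name -> nominal duration in seconds (hypothetical)
    predecessors: unit name -> units that must finish first
    degradation:  unit name -> multiplier applied to the nominal duration
                  (assumed way of representing an environmental effect)
    """
    degradation = degradation or {}
    graph = {unit: tuple(predecessors.get(unit, ())) for unit in durations}
    finish = {}
    # Visit units so that every predecessor is processed before its successor.
    for unit in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[unit]), default=0.0)
        finish[unit] = start + durations[unit] * degradation.get(unit, 1.0)
    return max(finish.values(), default=0.0)


# Hypothetical multi-modal task: speech and trackball proceed in parallel,
# and the confirmation step waits for both.
durations = {
    "speak_command": 1.0,
    "recognise": 0.5,
    "move_trackball": 1.2,
    "confirm": 0.3,
}
predecessors = {
    "recognise": ["speak_command"],
    "confirm": ["recognise", "move_trackball"],
}

baseline = critical_path_time(durations, predecessors)
# Assumed noisy environment: recognition takes three times as long.
noisy = critical_path_time(durations, predecessors, {"recognise": 3.0})
print(f"baseline: {baseline:.2f} s, noisy: {noisy:.2f} s")
```

In this toy network the speech branch is the critical path, so slowing the recogniser lengthens the predicted task completion time, whereas a small slowdown of the trackball movement would not.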