This paper discusses the requirements for developing a multimodal spoken dialogue system for mobile phone applications. Since the visual output of such a system is limited by the restricted screen size of mobile phones, research on information visualisation for small-screen devices is discussed, and combinations of these techniques with spoken output are sketched. A testbed for the development and evaluation of multimodal dialogue systems for mobile phones is currently under development. The architecture of the system is described. Design decisions for the implementation of a prototypical but realistic application, an information retrieval system for the real estate domain, are pointed out. The system forms the basis for future field trials.