This paper discusses the design of a modality-independent multimodal interaction (MMI) system architecture. The architecture divides an MMI system into three modules: a document server module, which holds dialog scenarios and contents; a dialog manager, which controls dialog flow; and a front-end module, which handles the user's inputs and the system's outputs. Because the document server module and the dialog manager are independent of modalities, they can be reused when new terminals with different modalities are introduced. We also propose XISL, an MMI description language. XISL describes the user's inputs and the system's outputs flexibly enough that interactions on various terminals can be described without introducing a new description language and its processor. We present a prototype online shopping application implemented on this architecture and compare XISL with other languages.
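
The three-module division can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the implementation described in this paper: every class name, the `SCENARIO` dictionary, and the two simulated modalities are hypothetical, and the scenario is written as plain Python data rather than XISL.

```python
from abc import ABC, abstractmethod


class DocumentServer:
    """Holds dialog scenarios and contents (hypothetical in-memory store)."""

    def __init__(self, scenarios):
        self._scenarios = scenarios

    def get_step(self, name):
        return self._scenarios[name]


class FrontEnd(ABC):
    """Modality-specific module: maps raw input to abstract events
    and renders the system's output on a particular terminal."""

    @abstractmethod
    def read_event(self, raw): ...

    @abstractmethod
    def render(self, message): ...


class TextFrontEnd(FrontEnd):
    def read_event(self, raw):
        return raw.strip().lower()

    def render(self, message):
        return f"[screen] {message}"


class SpeechFrontEnd(FrontEnd):
    def read_event(self, raw):
        # Assumption: `raw` stands in for a recognizer result such as "uh, buy";
        # only the text after the last comma is treated as the command.
        return raw.split(",")[-1].strip().lower()

    def render(self, message):
        return f"[tts] {message}"


class DialogManager:
    """Controls dialog flow; it never inspects the modality in use."""

    def __init__(self, server, front_end, start="start"):
        self.server = server
        self.front_end = front_end
        self.state = start

    def prompt(self):
        step = self.server.get_step(self.state)
        return self.front_end.render(step["prompt"])

    def handle(self, raw):
        event = self.front_end.read_event(raw)
        transitions = self.server.get_step(self.state)["next"]
        # Unknown events leave the dialog in its current state.
        self.state = transitions.get(event, self.state)
        return self.prompt()


# Hypothetical two-step shopping scenario shared by all terminals.
SCENARIO = {
    "start": {"prompt": "Say or type 'buy' to order.", "next": {"buy": "confirm"}},
    "confirm": {"prompt": "Order placed.", "next": {}},
}
```

In this sketch, `DocumentServer` and `DialogManager` are used unchanged for both terminals; supporting another modality would only require another `FrontEnd` subclass.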