This paper first describes a keyword spotting unit (KeySpot). KeySpot can recognize 200 keywords in continuous speech or 1000 isolated words, and it also contains an adaptive noise canceller. The paper then describes a multimodal, keyword-based spoken dialogue system (MultiksDial) that incorporates KeySpot. The system provides multiple input channels, spontaneous speech and touch, as well as multiple output channels, graphics and voice response. The system also uses three sensors to detect the user's behavior and plan interaction strategies accordingly. An experiment on a directory guidance task shows better usability than an ordinary touch-screen system.