The communication with robotic assistants or companions is a challenging new domain for the use of dialog systems. In contrast to classical spoken language interfaces users interact with mobile robots mostly in a multi-modal way. In this paper we will present the integration of several modalities in the dialog system of BIRON - the Bielefeld Robot Companion. Besides speech as the main modality the system integrates deictic gestures and visual scene information in order to resolve object references in a task oriented dialog. We will present example interactions with BIRON and first qualitative results from the "home-tour" scenario defined within the COGNIRON project.