The aim of this paper is to introduce multimodal identity verification techniques to the Interactive Dialogue Systems community and to identify possible applications. The multimodal identity verification system presented here is based on two biometric modalities (speech and vision) and uses three experts in parallel, which operate on voice, frontal face images, and profile images. Each expert outputs a scalar, called a score, that states how well the claimed identity is verified. A fusion module receives the three scores as input and must make a binary decision: accept or reject the claimed identity. We have solved this fusion problem using a wide range of statistical pattern recognition techniques. The performance of the different fusion modules has been evaluated and compared on a multimodal database containing both vocal and visual modalities.
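
To make the fusion problem concrete, the following minimal sketch shows one of the simplest conceivable fusion rules: a weighted sum of the three expert scores compared against a threshold. It is an illustration only and not necessarily one of the techniques evaluated in this paper. The class name `LinearFusion`, the weights, the threshold, and the assumption that scores lie in [0, 1] with higher values meaning stronger support for the claimed identity are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LinearFusion:
    """Hypothetical fusion module: weighted sum of expert scores plus a threshold."""

    weights: Sequence[float]  # one weight per expert (illustrative values, not from the paper)
    threshold: float          # decision threshold on the fused score (illustrative)

    def fused_score(self, scores: Sequence[float]) -> float:
        """Combine the expert scores into a single scalar."""
        if len(scores) != len(self.weights):
            raise ValueError("expected one score per expert")
        return sum(w * s for w, s in zip(self.weights, scores))

    def accept(self, scores: Sequence[float]) -> bool:
        """Binary decision: True accepts the claimed identity, False rejects it."""
        return self.fused_score(scores) >= self.threshold


# Scores are ordered as (voice, frontal face, profile); all values are made up.
fusion = LinearFusion(weights=(0.5, 0.3, 0.2), threshold=0.5)
genuine_claim = (0.9, 0.8, 0.7)
impostor_claim = (0.2, 0.1, 0.3)
print(fusion.accept(genuine_claim), fusion.accept(impostor_claim))
```

In practice the weights and the threshold would be estimated from training data rather than fixed by hand, and a trained statistical classifier could replace the weighted sum while keeping the same three-scores-in, one-decision-out interface.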