ISCA Archive Interspeech 2006

On the fusion of prosody, voice spectrum and face features for multimodal person verification

M. Farrús, A. Garde, P. Ejarque, J. Luque, Javier Hernando

Multimodal person recognition systems normally use short-term spectral features as voice information. In this paper, prosodic information is added to a system based on face and voice spectrum features. Using two fusion techniques, support vector machines and matcher weighting, several strategies are proposed that fuse the monomodal scores in multiple steps. Adding the prosodic information clearly improves system performance, and the best results are achieved when the prosodic scores are first fused together and the resulting scores are then fused with the spectral and facial scores. Speech scores were obtained on the Switchboard-I database and face scores on the XM2VTS database.
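
The two-step strategy can be illustrated with a minimal sketch of matcher weighting. The abstract does not define the weights, so the sketch assumes a common formulation in which each matcher's weight is inversely proportional to its equal error rate (EER) and all scores are already normalized to a common range. The function names, the EER values and the scores are hypothetical and do not come from the paper.

```python
def matcher_weights(eers):
    """Assumed weighting: inversely proportional to each matcher's EER, summing to 1."""
    inverse = [1.0 / e for e in eers]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuse(scores, weights):
    """Weighted sum of already-normalized scores."""
    return sum(w * s for w, s in zip(weights, scores))


def two_step_fusion(prosodic_scores, prosodic_eers,
                    spectral_score, face_score, second_stage_eers):
    """Fuse the prosodic scores first, then fuse the result with spectral and face scores.

    second_stage_eers holds hypothetical EERs for, in order: the fused
    prosodic score, the spectral score and the face score.
    """
    prosodic_fused = fuse(prosodic_scores, matcher_weights(prosodic_eers))
    second_stage_scores = [prosodic_fused, spectral_score, face_score]
    return fuse(second_stage_scores, matcher_weights(second_stage_eers))


# Illustrative values only (not taken from the paper).
final_score = two_step_fusion(
    prosodic_scores=[0.2, 0.6],
    prosodic_eers=[0.25, 0.25],
    spectral_score=0.8,
    face_score=0.9,
    second_stage_eers=[0.2, 0.1, 0.1],
)
```

Under this assumed weighting, a matcher with a lower EER contributes more to the fused score, so the less reliable prosodic scores are combined into a single score before being weighed against the spectral and facial scores.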