ISCA Archive SLAM 2013

QCompere @ REPERE 2013

Hervé Bredin, Johann Poignant, Guillaume Fortier, Makarand Tapaswi, Viet-Bac Le, Anindya Roy, Claude Barras, Sophie Rosset, Achintya Sarkar, Qian Yang, Hua Gao, Alexis Mignon, Jakob Verbeek, Laurent Besacier, Georges Quénot, Hazim Kemal Ekenel, Rainer Stiefelhagen

We describe the QCompere consortium's submissions to the REPERE 2013 evaluation campaign. The REPERE challenge aims to bring together four communities (face recognition, speaker identification, optical character recognition and named entity detection) around a common goal: multimodal person recognition in TV broadcasts. First, we introduce four mono-modal components, one for each of these communities, which constitute the elementary building blocks of our various submissions. Then, depending on the target modality (speaker or face recognition) and on the task (supervised or unsupervised recognition), we introduce four different fusion techniques, which can be summarized as propagation-, classifier-, rule- or graph-based approaches. Finally, we evaluate their performance on the REPERE 2013 test set and discuss their advantages and limitations.

Index Terms: speaker identification, face recognition, named entity detection, video optical character recognition, multimodal fusion

Cite as: Bredin, H., Poignant, J., Fortier, G., Tapaswi, M., Le, V.-B., Roy, A., Barras, C., Rosset, S., Sarkar, A., Yang, Q., Gao, H., Mignon, A., Verbeek, J., Besacier, L., Quénot, G., Ekenel, H.K., Stiefelhagen, R. (2013) QCompere @ REPERE 2013. Proc. First Workshop on Speech, Language and Audio in Multimedia (SLAM 2013), 49-54

  author={Hervé Bredin and Johann Poignant and Guillaume Fortier and Makarand Tapaswi and Viet-Bac Le and Anindya Roy and Claude Barras and Sophie Rosset and Achintya Sarkar and Qian Yang and Hua Gao and Alexis Mignon and Jakob Verbeek and Laurent Besacier and Georges Quénot and Hazim Kemal Ekenel and Rainer Stiefelhagen},
  title={{QCompere @ REPERE 2013}},
  booktitle={Proc. First Workshop on Speech, Language and Audio in Multimedia (SLAM 2013)},