ISCA Archive IberSPEECH 2018

In-domain Adaptation Solutions for the RTVE 2018 Diarization Challenge

Ignacio Viñals, Pablo Gimeno, Alfonso Ortega, Antonio Miguel, Eduardo Lleida

This paper addresses domain mismatch in the diarization task. The research was carried out in the context of the Radio Televisión Española (RTVE) 2018 Challenge at IberSpeech 2018. The evaluation seeks to improve diarization of broadcast corpora, which are known to contain multiple unknown speakers. These speakers appear across different scenarios, genres, media and languages. The evaluation offers two conditions: a closed one, which restricts the resources (both acoustic data and further knowledge) available to train and develop diarization systems, and an open one without restrictions, intended to check the latest improvements in the state of the art. Our proposal is centered on the closed condition, dealing especially with two important mismatches: media and language. The ViVoLab system for the challenge is based on the i-vector PLDA framework: i-vectors are extracted from the input audio according to a given segmentation, assuming that each segment represents one speaker intervention. The diarization hypotheses are obtained by clustering the estimated i-vectors with a Fully Bayesian PLDA, a generative model with latent variables as speaker labels. The number of speakers is decided by comparing multiple hypotheses according to the Evidence Lower Bound (ELBO) provided by the PLDA.
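The model-selection idea in the last sentence of the abstract (score several hypotheses with different numbers of speakers and keep the best one) can be illustrated with a minimal sketch. This is not the ViVoLab system: toy 2-D vectors stand in for i-vectors, centroid-linkage agglomerative clustering stands in for the variational PLDA clustering, and a BIC-style penalized Gaussian fit with an assumed known within-speaker variance stands in for the ELBO. All function names and parameters here are hypothetical.

```python
import math


def centroid(points):
    """Mean vector of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def agglomerate(vectors, k):
    """Merge the two clusters with the closest centroids until k remain."""
    clusters = [[v] for v in vectors]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sq_dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def score(clusters, within_var):
    """BIC-style score (a stand-in for the ELBO; higher is better).

    Fit term: spherical Gaussian log-likelihood around each cluster mean,
    with an assumed within-speaker variance and constants shared by all
    hypotheses dropped. Penalty: 0.5 * (free mean parameters) * log(N).
    """
    n = sum(len(c) for c in clusters)
    dim = len(clusters[0][0])
    sse = sum(sq_dist(v, centroid(c)) for c in clusters for v in c)
    fit = -sse / (2.0 * within_var)
    penalty = 0.5 * len(clusters) * dim * math.log(n)
    return fit - penalty


def select_num_speakers(vectors, max_speakers, within_var):
    """Score one hypothesis per candidate speaker count and keep the best."""
    scores = {
        k: score(agglomerate(vectors, k), within_var)
        for k in range(1, max_speakers + 1)
    }
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Toy data: three "speakers", four segment embeddings each.
offsets = [(0.0, 0.1), (0.1, -0.1), (-0.1, 0.0), (0.0, 0.0)]
centers = [(0.0, 0.0), (5.0, 5.0), (-5.0, 5.0)]
segments = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

best_k, scores = select_num_speakers(segments, max_speakers=6, within_var=0.01)
print(best_k)
```

On this toy data the three-speaker hypothesis scores highest: merging two speakers costs far more in fit than it saves in penalty, while splitting a speaker gains less fit than the extra parameters cost. An ELBO plays a comparable role in a Bayesian model, trading data fit against model complexity.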

doi: 10.21437/IberSPEECH.2018-45

Cite as: Viñals, I., Gimeno, P., Ortega, A., Miguel, A., Lleida, E. (2018) In-domain Adaptation Solutions for the RTVE 2018 Diarization Challenge. Proc. IberSPEECH 2018, 220-223, doi: 10.21437/IberSPEECH.2018-45

  author={Ignacio Viñals and Pablo Gimeno and Alfonso Ortega and Antonio Miguel and Eduardo Lleida},
  title={{In-domain Adaptation Solutions for the RTVE 2018 Diarization Challenge}},
  booktitle={Proc. IberSPEECH 2018},