ISCA Archive IberSPEECH 2024

HiTZ-Aholab Speaker Diarization System for Albayzin Evaluations of IberSPEECH 2024

Christoforos Souganidis, Gemma Meseguer, Asier Herranz, Inma Hernáez Rioja, Eva Navas, Ibon Saratxaga

This paper describes the Speaker Diarization (SD) systems submitted by HiTZ-Aholab to IberSPEECH 2024 for the SD task of the Speaker Diarization and Identity Assignment Challenge of the Albayzin Evaluations. We present three systems based on pyannote.audio 3.0, an open-source Python toolkit developed for various speech processing tasks. For all three submitted systems we fine-tuned the pre-trained segmentation model (v3.0) under four separate objectives: minimizing the Diarization Error Rate (DER) and minimizing each of its components, i.e., False Alarm (FA), Missed Detection (MISS) and Speaker Confusion Rate (CONF). We then applied the pre-trained SD pipeline (v3.0) to generate the output for each of the four resulting fine-tuned models, and fused these outputs in various combinations using the majority voting algorithm DOVER-LAP. Our primary system obtained a DER of 14.98%, while our two contrastive systems obtained DERs of 14.95% and 15.19%.
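As a reminder of how the metric decomposes, the following minimal sketch computes DER from its three components under the standard definition (the sum of false alarm, missed detection and speaker confusion durations, divided by the total reference speech duration). The function name and the durations in the usage note are hypothetical and are not taken from the evaluation.

```python
def diarization_error_rate(false_alarm: float,
                           missed_detection: float,
                           confusion: float,
                           total_speech: float) -> float:
    """Return DER as a fraction, given durations in seconds.

    All arguments are durations: the three error components and the
    total duration of reference speech used as the denominator.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    if min(false_alarm, missed_detection, confusion) < 0:
        raise ValueError("error durations must be non-negative")
    return (false_alarm + missed_detection + confusion) / total_speech
```

For example, with hypothetical durations of 2 s of false alarm, 3 s of missed detection and 5 s of confusion over 100 s of reference speech, `diarization_error_rate(2.0, 3.0, 5.0, 100.0)` gives a DER of 10%. Because the components are additive, a model tuned to reduce one of them can raise the others, which is one motivation for fusing the outputs of differently tuned models.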