ISCA Archive Interspeech 2022

Online Speaker Diarization with Core Samples Selection

Yanyan Yue, Jun Du, Mao-Kui He, YuTing Yeung, Renyu Wang

We propose a novel online speaker diarization approach based on the VBx algorithm, which performs well on offline speaker diarization tasks. To process long recordings efficiently, we perform online diarization in a block-wise manner. First, we devise a strategy for updating core samples that uses a time penalty function to preserve important historical information at a low memory cost. Then, we select clustering samples from the core samples by stratified sampling, which enhances the variability among samples while retaining sufficient speaker identity information and helps VBx improve classification accuracy on a small amount of data. Finally, we resolve the label ambiguity problem with a global constrained clustering algorithm. We evaluate our system on the DIHARD and AMI datasets. The experimental results demonstrate that our online approach outperforms the state of the art.
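
The abstract does not give the exact form of the time penalty function or of the sampling strata, so the following Python sketch only illustrates the general idea of the first two steps under stated assumptions. The exponential decay, the `quality` field, the use of provisional speaker labels as strata, and the function names `update_core_samples` and `stratified_select` are all hypothetical and are not taken from the paper. The VBx clustering and the global constrained clustering steps are not covered.

```python
import math
import random
from collections import defaultdict


def update_core_samples(core, block, now, capacity, decay=0.01):
    """Merge a new block into the core set and keep the best `capacity` samples.

    Each sample is a dict with keys "emb", "label", "time", and "quality".
    ASSUMPTION: the score is quality * exp(-decay * age), an exponential
    time penalty chosen for illustration; the paper's function may differ.
    """
    def score(sample):
        age = now - sample["time"]
        return sample["quality"] * math.exp(-decay * age)

    merged = sorted(core + block, key=score, reverse=True)
    return merged[:capacity]  # bounded memory regardless of recording length


def stratified_select(core, per_speaker, rng):
    """Draw up to `per_speaker` samples from each stratum.

    ASSUMPTION: strata are the provisional speaker labels of the core samples.
    """
    strata = defaultdict(list)
    for sample in core:
        strata[sample["label"]].append(sample)

    chosen = []
    for label in sorted(strata):
        group = strata[label]
        k = min(per_speaker, len(group))
        chosen.extend(rng.sample(group, k))  # sampling without replacement
    return chosen
```

In a block-wise loop, `update_core_samples` would be called once per incoming block, and the output of `stratified_select` would be passed to the clustering step. Capping the number of samples per stratum keeps a dominant speaker from crowding out speakers with little speech, which is one plausible reading of the variability goal stated above.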