ISCA Archive Odyssey 2022

DP-Means: An Efficient Bayesian Nonparametric Model for Speaker Diarization

Yijun Gong, Xiao-Lei Zhang

Recently, clustering based on Bayesian probabilistic models has achieved superior performance in speaker diarization. However, it is much more complicated than the widely used efficient clustering algorithms, which makes it inconvenient for some real-life scenarios. In this paper, we propose a covariance-asymptotic variant of Dirichlet process mixture models (DPMM), named Dirichlet process means (DP-means) clustering, for speaker diarization. Similar to Bayesian nonparametric models (e.g., DPMM), DP-means can generate new clusters throughout the clustering process, which suits the speaker diarization problem, where the number of speakers is determined on the fly. Unlike Bayesian nonparametric models, DP-means is a hard clustering method that does not need to optimize the variances of the mixture components, which makes it efficient for real-world problems. We further exploit an initialization method to obtain the prior cluster centroids for DP-means. Experimental results on the CALLHOME, AMI, and DIHARD III corpora show that the proposed method is more efficient than state-of-the-art speaker clustering methods, with only slight performance degradation.
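To make the hard-assignment, cluster-spawning behaviour concrete, the following is a minimal sketch of a generic DP-means iteration. It is not the authors' implementation. The squared Euclidean distance, the threshold name `lam`, the function name `dp_means`, and the optional `init_centroids` argument are all assumptions made for illustration; the abstract does not specify the distance measure or how the prior centroids are obtained.

```python
import numpy as np


def dp_means(X, lam, init_centroids=None, max_iter=100):
    """Generic DP-means sketch (illustrative, not the paper's code).

    X              : (n_samples, n_dims) array, e.g. speaker embeddings.
    lam            : assumed threshold on squared Euclidean distance; a point
                     farther than this from every centroid opens a new cluster.
    init_centroids : hypothetical optional prior centroids; if None, start
                     from one cluster at the global mean.
    """
    X = np.asarray(X, dtype=float)
    if init_centroids is None:
        centroids = [X.mean(axis=0)]
    else:
        centroids = [np.asarray(c, dtype=float) for c in init_centroids]
    labels = np.zeros(len(X), dtype=int)

    for _ in range(max_iter):
        old_labels = labels.copy()

        # Hard assignment step: nearest centroid, or a new cluster.
        for i, x in enumerate(X):
            d2 = np.sum((np.asarray(centroids) - x) ** 2, axis=1)
            k = int(np.argmin(d2))
            if d2[k] > lam:
                centroids.append(x.copy())  # the point seeds a new cluster
                k = len(centroids) - 1
            labels[i] = k

        # Update step: recompute means and drop clusters left empty.
        used = np.unique(labels)
        centroids = [X[labels == k].mean(axis=0) for k in used]
        labels = np.searchsorted(used, labels)  # relabel to 0..K-1

        if np.array_equal(labels, old_labels):
            break

    return labels, np.asarray(centroids)


if __name__ == "__main__":
    # Toy data: two well-separated groups standing in for two speakers.
    X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
    labels, centroids = dp_means(X, lam=1.0)
    print(labels, len(centroids))
```

Note that only centroids are updated; no per-cluster variance is estimated, which is the property the abstract credits for the method's efficiency. The number of clusters is never passed in: it emerges from the threshold `lam`, so a larger value yields fewer clusters.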