ISCA Archive Interspeech 2007

A robust stopping criterion for agglomerative hierarchical clustering in a speaker diarization system

Kyu J. Han, Shrikanth S. Narayanan

Agglomerative hierarchical clustering (AHC) is an unsupervised classification strategy that recursively merges the closest pair of clusters, and it has been widely used in speaker diarization systems to classify speech segments by speaker identity. The most critical part of AHC is automatically stopping the recursive process at the point where the clustering error rate reaches its lowest possible value, for which a stopping criterion based on the Bayesian information criterion (BIC) has been widely used. However, this criterion is not robust to data source variation. In this paper, we examine the criterion to establish the cause of the robustness issue and, based on this, propose an improved stopping criterion. Experimental results on meeting conversation excerpts randomly chosen from various meeting speech corpora indicate that the proposed criterion is superior to the BIC-based one, improving the clustering error rate on average by 7.28% (absolute) and 34.16% (relative).
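For orientation, the following is a minimal sketch of AHC with a conventional BIC-based stopping rule, i.e. the baseline the abstract refers to, not the improved criterion proposed in the paper (which the abstract does not specify). It assumes each cluster is modeled by a single full-covariance Gaussian and that each segment is an array of feature frames (one frame per row). The names `delta_bic`, `ahc_bic`, and the penalty weight `lam` are hypothetical and chosen for illustration only.

```python
import numpy as np


def logdet_cov(x):
    """Log-determinant of the maximum-likelihood full covariance of the rows of x."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    cov = cov + 1e-6 * np.eye(x.shape[1])  # small ridge for numerical stability (assumption)
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(a, b, lam=1.0):
    """Delta-BIC between 'a and b are separate' and 'a and b are merged'.

    Sign convention used here: a negative value favors merging.
    """
    n1, n2 = len(a), len(b)
    n = n1 + n2
    d = a.shape[1]
    merged = np.vstack([a, b])
    # Penalty for the extra Gaussian (mean + full covariance) of the two-cluster model.
    penalty = 0.5 * lam * (d + d * (d + 1) / 2) * np.log(n)
    gain = 0.5 * (n * logdet_cov(merged) - n1 * logdet_cov(a) - n2 * logdet_cov(b))
    return gain - penalty


def ahc_bic(segments, lam=1.0):
    """Greedily merge the closest pair of clusters until no pair has negative delta-BIC."""
    clusters = [np.asarray(s, dtype=float) for s in segments]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = delta_bic(clusters[i], clusters[j], lam)
                if best is None or score < best[0]:
                    best = (score, i, j)
        score, i, j = best
        if score >= 0:  # stopping criterion: no remaining pair is better off merged
            break
        clusters[i] = np.vstack([clusters[i], clusters[j]])
        del clusters[j]
    return clusters
```

In this sketch the number of speakers is simply the number of clusters left when the loop stops. The result depends on `lam`, which typically has to be tuned per data source.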