This paper extends our previous work on using i-vectors for speaker diarization. We examine the effectiveness of spectral clustering as an alternative to the K-means clustering used in our previous approach, and we adapt a previously used heuristic to estimate the number of speakers. Additionally, we consider an iterative optimization scheme and experiment with its ability to improve both cluster assignments and segmentation boundaries in an unsupervised manner. Our proposed methods attain results comparable to those of a state-of-the-art benchmark on the multi-speaker CallHome telephone corpus.
Index Terms: speaker diarization, factor analysis, Total Variability, spectral clustering