A novel distance measure for distance-based speaker segmentation is proposed. The measure is non-parametric, in contrast to the distance measures commonly used in speaker segmentation systems, which often assume a Gaussian distribution when measuring the distance between two audio segments. It is essentially a k-nearest-neighbor distance measure. Removal of non-vowel segments in the pre-processing stage is also proposed. Speaker segmentation performance is tested on artificially created conversations from the TIMIT database and on two AMI conversations. For short window lengths, the Missed Detection Rate decreases significantly. For moderate window lengths, both the Missed Detection Rate and the False Alarm Rate decrease. The computational cost of the distance measure is high for long window lengths.
Index Terms: speaker segmentation, distance measure, k-nearest-neighbor
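
As a rough illustration of what a non-parametric, k-nearest-neighbor distance between two audio segments might look like, the sketch below averages, for every feature frame in one segment, the Euclidean distance to its k nearest frames in the other segment, and then symmetrizes the result. This is an assumed formulation chosen for illustration and is not necessarily the measure defined in this paper; the function names, the choice of k = 3, and the synthetic Gaussian "frames" are all hypothetical.

```python
import math
import random


def knn_cross_distance(a, b, k=3):
    """Mean distance from each frame in `a` to its k nearest frames in `b`.

    Hypothetical helper: `a` and `b` are lists of equal-length feature vectors.
    """
    k = min(k, len(b))
    total = 0.0
    for frame in a:
        # Brute-force search: distance from this frame to every frame in b.
        nearest = sorted(math.dist(frame, other) for other in b)[:k]
        total += sum(nearest) / k
    return total / len(a)


def knn_segment_distance(a, b, k=3):
    """Symmetrized k-NN distance between two segments (illustrative only)."""
    return 0.5 * (knn_cross_distance(a, b, k) + knn_cross_distance(b, a, k))


def make_segment(rng, mean, n_frames=60, dim=4):
    """Synthetic stand-in for a window of acoustic feature frames."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n_frames)]


rng = random.Random(0)
same_1 = make_segment(rng, mean=0.0)
same_2 = make_segment(rng, mean=0.0)
other = make_segment(rng, mean=5.0)

d_same = knn_segment_distance(same_1, same_2)  # both windows from one source
d_diff = knn_segment_distance(same_1, other)   # windows from different sources
print(d_same, d_diff)
```

No Gaussian model is fitted to either window; only frame-to-frame distances are used. The brute-force search computes every pairwise distance, so its cost grows with the product of the two window lengths, which is consistent with the high computational cost noted for long windows.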