The effective handling of overlapping speech is at the limits of the current state-of-the-art in speaker diarization. This paper presents our latest work in overlap detection. We report the combination of features derived through convolutive non-negative sparse coding and new energy, spectral and voicing-related features within a conventional HMM system. Overlap detection results are fully integrated into our topdown diarization system through the application of overlap exclusion and overlap labelling. Experiments on a subset of the AMI corpus show that the new system delivers significant reductions in missed speech and speaker error. Through overlap exclusion and labelling the overall diarization error rate is shown to improve by 6.4% relative.
Index Terms: speech overlap detection, convolutive nonnegative sparse coding, speaker diarization