ISCA Archive Interspeech 2007
ISCA Archive Interspeech 2007

Trainable speaker diarization

Hagai Aronowitz

This paper presents a novel framework for speaker diarization. We explicitly model intra-speaker inter-segment variability using a speaker-labeled training corpus and use this modeling to assess the speaker similarity between speech segments. Modeling is done by embedding segments into a segment-space using kernel-PCA, followed by explicit modeling of speaker variability in the segment-space. Our framework leads to a significant improvement in diarization accuracy. Finally, we present a similar method for bandwidth classification.