ISCA Archive Eurospeech 1995

Separation of speakers in audio data

Jesper O. Olsen

Speaker separation is a technique with many potential applications, for instance as an aid in browsing audio documents. This paper describes a novel speaker separation method in which speaker models are created without any training data being available in advance. The method was tested on realistic, unconstrained telephone conversations, with ergodic hidden Markov models used for speaker modelling. With no prior knowledge of the speakers (i.e. no training data), the overall sequence and duration accuracies were 87% and 94%, respectively.

Keywords: Speaker Separation, Speaker Recognition, Hidden Markov Models.
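As an illustration of what an ergodic model implies for segmentation, the following is a minimal sketch and not the method of the paper, whose features, model sizes and training procedure are not given in this abstract. It assumes that per-frame log-likelihoods for each speaker are already available, treats each speaker as one state of a fully connected (ergodic) HMM, and uses a hypothetical `stay_prob` parameter to penalise speaker changes. The function name `viterbi_speaker_labels` and the toy scores are likewise assumptions made for the example.

```python
import math


def viterbi_speaker_labels(frame_loglik, stay_prob=0.9):
    """Return the most likely speaker index for each frame.

    frame_loglik: list of frames, each a list with one log-likelihood
                  per speaker (assumed to be given; at least two speakers).
    stay_prob:    hypothetical self-transition probability; the remaining
                  mass is shared equally among the other speakers.
    """
    n_speakers = len(frame_loglik[0])
    # Ergodic topology: every state can be reached from every other state.
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (n_speakers - 1))

    # Uniform initial distribution over speakers.
    score = [math.log(1.0 / n_speakers) + ll for ll in frame_loglik[0]]
    backptr = []
    for obs in frame_loglik[1:]:
        new_score, ptr = [], []
        for j in range(n_speakers):
            cands = [
                score[i] + (log_stay if i == j else log_move)
                for i in range(n_speakers)
            ]
            best = max(range(n_speakers), key=lambda i: cands[i])
            new_score.append(cands[best] + obs[j])
            ptr.append(best)
        score = new_score
        backptr.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_speakers), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy scores for two speakers; frame 2 weakly favours speaker 1.
frames = [[-1, -5], [-1, -5], [-2, -1], [-1, -5], [-5, -1], [-5, -1], [-5, -1]]
print(viterbi_speaker_labels(frames))                 # [0, 0, 0, 0, 1, 1, 1]
print(viterbi_speaker_labels(frames, stay_prob=0.5))  # [0, 0, 1, 0, 1, 1, 1]
```

With `stay_prob=0.9` the isolated frame that weakly favours the second speaker is absorbed into the surrounding segment, whereas with `stay_prob=0.5` all transitions are equally likely and the labels reduce to a frame-by-frame maximum.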