It is often important to be able to label automatically who spoke when in an audio recording. This paper describes two systems for audio segmentation, developed at CUED and MIT-LL, and evaluates their performance using the speaker diarisation score defined in the 2003 Rich Transcription Evaluation. A new clustering procedure and a BIC-based stopping criterion are introduced for the CUED system; together they improve both performance and robustness to changes in segmentation. Finally, a hybrid "Plug and Play" system is built that combines different parts of the CUED and MIT-LL systems into a single system that outperforms both individual systems.