This paper reports on an investigation into the relative performance of several approaches to automatic Speaker Recognition (SR). The techniques examined were Dynamic Time Warping (DTW), Vector Quantisation (VQ), and Hidden Markov Modelling (HMM). To test these techniques, two sets of test speech data were created. The first set was acquired under controlled conditions; the speech was digitised and manually end-pointed. This set served as a reference test for the algorithms devised. The second set was acquired over a 'dialled-up' telephone link. In this case, the data was automatically prompted for, acquired, and end-pointed in real time by computer, which permitted testing of the speaker recognition systems under realistic operating conditions. Using LPC-derived cepstral coefficients to represent the test speech, three text-dependent speaker recognition systems were produced. On testing, using several codebook sizes, the VQ-based system gave the best performance, with the HMM-based and DTW-based systems producing similar, but less successful, results.
Keywords: Speaker Recognition, Speaker Identification, Speaker Verification, Speech Databases