ISCA Archive Eurospeech 1993

Performance comparison of machine and human speaker verification

M. Mehdi Homayounpour, J. Philippe Goldman, Gérard Chollet, Jacqueline Vaissière

This paper concerns the problem of speech variability in Automatic Speaker Verification (ASV) systems. The performance of our ASV system was compared with that of human listeners on material spoken in four different emotional modes (neutral, happiness, fatigue and anger). The resulting variation in pitch contours, speaking rate, intensity, formant values, etc. had a relatively strong influence on the performance of both our ASV system and the human listeners. In another experiment, a speaker verification task was carried out both automatically and by human listeners on a telephone database in which 24 speakers tried to imitate two reference speakers. In the last experiment, the prosodic and spectral characteristics of speech were varied artificially, both as a means of simulating variation due to emotion and as a means of augmenting limited databases for evaluating and comparing different ASV systems.

Keywords: speaker verification, emotion, imitation, TD-PSOLA