ISCA Archive Interspeech 2015

Human vs machine spoofing detection on wideband and narrowband data

Mirjam Wester, Zhizheng Wu, Junichi Yamagishi

How well do humans detect spoofing attacks directed at automatic speaker verification systems? This paper investigates the performance of humans at detecting spoofing attacks from speech synthesis and voice conversion systems. Two speaker verification tasks, in which the speakers were either humans or machines, were also conducted. The three tasks were carried out with two types of data: wideband (16 kHz) and narrowband (8 kHz) telephone-line simulated data. Spoofing detection by humans was compared to automatic spoofing detection (ASD) algorithms. Listening tests were carefully constructed to ensure the human and automatic tasks were as similar as possible, taking into consideration listeners' constraints (e.g., fatigue and memory limitations). Results for human trials show that error rates on narrowband data are double those on wideband data. The second verification task, which included only artificial speech, showed equal overall acceptance rates for both 8 kHz and 16 kHz. In the spoofing detection task, there was a drop in performance on narrowband data for most of the artificial trials as well as for human trials. At 8 kHz, 20% of human trials were incorrectly classified as artificial, compared to 12% at 16 kHz. The ASD algorithms also showed a drop in performance on 8 kHz data, but outperformed human listeners across the board.