Do you read me? - flow of speech effect on speaker recognition systems
Alicja Martinek, Joanna Gajewska, Ewelina Bartuzi-Trokielewicz
Comparing read speech with spontaneous speech poses a significant challenge for speaker verification models. This study examines how the differences between the two affect the performance of advanced biometric systems. We tested two baseline speaker verification models and two spoof-aware approaches to assess how well they handle variation between read and spontaneous speech. We also generated synthetic speech with two state-of-the-art Text-to-Speech methods, training the models on either spontaneous or read speech. The results indicate that comparing samples of different speech types (spontaneous against read) yields higher error rates in biometric verification than comparing samples of a single speech type. This holds in both testing scenarios, genuine versus impostor and genuine versus synthetic speech, and for both kinds of speaker recognition model, baseline and spoof-aware.