ISCA Archive SPSC 2025

Are audio DeepFake detection models polyglots?

Bartłomiej Marek, Piotr Kawa, Piotr Syga

Since the majority of audio DeepFake (DF) detection methods are trained on English-centric datasets, their applicability to non-English languages remains largely unexplored. In this work, we introduce a benchmark for the multilingual audio DF detection challenge by evaluating various adaptation strategies. Our experiments analyze models trained on English benchmark datasets, as well as intra-linguistic (same-language) and cross-linguistic adaptation approaches. Our results indicate considerable variations in detection efficacy, highlighting the difficulties of multilingual settings. We show that limiting the training dataset to English reduces detection efficacy, while using even a small amount of data in the target language is more beneficial for detection than adding larger volumes of data from multiple non-target languages combined.