ISCA Archive Interspeech 2020

Self-Supervised Spoofing Audio Detection Scheme

Ziyue Jiang, Hongcheng Zhu, Li Peng, Wenbing Ding, Yanzhen Ren

With the development of deep generative models, spoofed audio produced by speech synthesis and voice conversion has become increasingly realistic, which challenges the credibility of media on social networks. This paper proposes a self-supervised spoofing audio detection scheme (SSAD). In SSAD, eight convolutional blocks capture the local features of the audio signal. A temporal convolutional network (TCN) captures the context features and can be computed in parallel. Three regression workers and one binary worker are designed to improve performance in fake and spoofing audio detection. Experimental results on the ASVspoof 2019 dataset show that the detection accuracy of SSAD exceeds that of state-of-the-art methods, which indicates that the self-supervised method is effective for the task of spoofing audio detection.
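As an illustration of why a TCN can capture context while still being computed in parallel, the following minimal sketch implements a causal dilated 1-D convolution in plain Python. It is not the SSAD implementation: the kernel size, the dilation schedule (1, 2, 4), and the function names `causal_dilated_conv1d`, `tcn_stack`, and `receptive_field` are all assumptions chosen for illustration, since the abstract does not specify these details. Each output sample depends only on a fixed set of past input samples, so all time steps could be evaluated independently.

```python
from typing import List, Sequence


def causal_dilated_conv1d(
    x: Sequence[float], kernel: Sequence[float], dilation: int
) -> List[float]:
    """Causal dilated 1-D convolution with implicit left zero-padding.

    y[t] = sum_k kernel[k] * x[t - k * dilation], with x[i] = 0 for i < 0.
    Each y[t] reads only inputs at or before t, so the outputs for
    different t do not depend on each other.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            i = t - k * dilation
            if i >= 0:
                acc += w * x[i]
        out.append(acc)
    return out


def tcn_stack(
    x: Sequence[float], kernel: Sequence[float], dilations: Sequence[int]
) -> List[float]:
    """Apply one causal dilated convolution per dilation, in order.

    Hypothetical simplification: real TCNs also use nonlinearities,
    residual connections, and multiple channels.
    """
    y = list(x)
    for d in dilations:
        y = causal_dilated_conv1d(y, kernel, d)
    return y


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input samples that can influence a single output sample."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    # Assumed toy settings, not taken from the paper.
    impulse = [1.0] + [0.0] * 9
    response = tcn_stack(impulse, kernel=[1.0, 1.0], dilations=[1, 2, 4])
    print(response)
    print(receptive_field(2, [1, 2, 4]))
```

With these toy settings, the impulse response spans exactly as many samples as `receptive_field` reports, which shows how doubling the dilation at each layer widens the context with only a few layers.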