ISCA Archive Interspeech 2023

Robust Audio Anti-spoofing Countermeasure with Joint Training of Front-end and Back-end Models

Xingming Wang, Bang Zeng, Suo Hongbin, Yulong Wan, Ming Li

The accuracy and reliability of many speech processing systems may deteriorate under noisy conditions. This paper studies robust audio anti-spoofing countermeasures for audio recorded in noisy environments. We first use a pre-trained speech enhancement model as the front-end module to build a cascaded system. However, because the enhancement model denoises independently of the anti-spoofing task, it may distort the synthesis artifacts or other anti-spoofing-related information contained in the utterances, leading to performance degradation. We therefore propose a new framework for robust audio anti-spoofing that jointly trains the integrated speech enhancement front-end and the anti-spoofing back-end. The final results demonstrate that the joint training framework is more effective than the cascaded framework. Additionally, we propose a cross-joint training scheme that allows a single model to exceed the performance of score-level fusion, making the joint framework both more effective and more efficient.
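
The contrast between the cascaded and joint frameworks can be sketched as two optimization problems. This is an illustrative formulation under assumed notation, not the formulation stated by the authors: $f_\theta$ is the enhancement front-end, $g_\phi$ the anti-spoofing back-end, $x$ a noisy utterance, $s$ its clean reference, $y$ the bona fide/spoof label, $\theta_0$ the pre-trained front-end weights, and $\lambda \ge 0$ a hypothetical weight on an enhancement loss. All of these symbols, and the presence of the enhancement term itself, are assumptions introduced here.

\[
\text{Cascaded:}\quad \min_{\phi}\; \mathcal{L}_{\mathrm{cls}}\big(g_\phi(f_{\theta_0}(x)),\, y\big)
\]

\[
\text{Joint:}\quad \min_{\theta,\,\phi}\; \mathcal{L}_{\mathrm{cls}}\big(g_\phi(f_\theta(x)),\, y\big) \;+\; \lambda\, \mathcal{L}_{\mathrm{enh}}\big(f_\theta(x),\, s\big)
\]

In the cascaded case the front-end stays fixed at $\theta_0$, so the anti-spoofing loss cannot influence what the denoiser keeps or removes. In the joint case the same loss also updates $\theta$, which gives the front-end a reason to preserve the artifacts the back-end relies on.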