ISCA Archive Interspeech 2023

Adversarial Diffusion Probability Model For Cross-domain Speaker Verification Integrating Contrastive Loss

Xinmei Su, Xiang Xie, Fengrun Zhang, Chenguang Hu

In speaker verification, performance degradation caused by domain mismatch is a common problem when the test domain lies outside the training distribution. In this paper, we present a novel domain transfer network, the Adversarial Diffusion Probabilistic Model (ADPM), to alleviate this problem. Specifically, ADPM transfers mel-spectrograms from the source domain into the target domain. To generate the mel-spectrograms, we treat the diffusion model as the generator and employ a discriminator for adversarial training. We also explore a contrastive learning objective to retain the context information of the source domain. The generated feature maps and the original feature maps from the source domain are fed jointly into a ResNet34 network to perform cross-domain speaker verification. We evaluate the proposed techniques on the VOiCES dataset, and our best model achieves a relative Equal Error Rate (EER) reduction of 8.94% compared to previous adaptation methods.
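For readers unfamiliar with the metric, the following is a minimal sketch of how an EER and a relative EER reduction can be computed from trial scores. It is not the evaluation code used in the paper: the function names `equal_error_rate` and `relative_reduction` and all score values are hypothetical, and the threshold sweep is a simple approximation rather than an interpolated EER.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline_eer, improved_eer):
    """Relative reduction: (baseline - improved) / baseline."""
    return (baseline_eer - improved_eer) / baseline_eer


# Hypothetical toy scores, for illustration only (not data from the paper).
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))
```

Under this definition, a relative reduction is a fraction of the baseline EER, not a difference in absolute percentage points.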