ISCA Archive Interspeech 2024

Contrastive Learning and Inter-Speaker Distribution Alignment Based Unsupervised Domain Adaptation for Robust Speaker Verification

Zuoliang Li, Wu Guo, Bin Gu, Shengyu Peng, Jie Zhang

Unsupervised domain adaptation (UDA) can mitigate the mismatch between the source and target domains in real-world speaker verification applications. In this paper, we propose a UDA method that leverages target-domain data through self-supervised learning. First, we use momentum contrastive learning to exploit the latent speaker labels in the target domain, enhancing intra-speaker compactness and inter-speaker separability simultaneously. Second, we improve the inter-speaker feature distribution alignment loss, ensuring the stability of the source-domain statistics and mitigating the impact of false negative pairs. These two methods are further combined with conventional supervised learning in the source domain. Using VoxCeleb2 as the source domain and CN-Celeb1 as the target domain, experimental results demonstrate the effectiveness of the proposed method.
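
The abstract does not specify the exact form of the contrastive objective, so the following is only a minimal sketch of a generic momentum-contrast step, assuming an InfoNCE loss over one positive key and a queue of negative keys, plus an exponential-moving-average update of the key encoder. The function names (`info_nce_loss`, `momentum_update`), the temperature of 0.07, and the momentum of 0.999 are illustrative assumptions, not values taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(query, positive_key, queue, temperature=0.07):
    """InfoNCE loss for one query embedding (assumed form, not the paper's).

    query:        embedding from the query encoder
    positive_key: embedding of another segment assumed to share the speaker
    queue:        list of key embeddings treated as negatives
    """
    q = l2_normalize(query)
    # The positive logit is placed at index 0, followed by the negatives.
    logits = [dot(q, l2_normalize(positive_key)) / temperature]
    logits += [dot(q, l2_normalize(k)) / temperature for k in queue]
    # Numerically stable log-sum-exp over all logits.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # Cross-entropy with the positive as the target class.
    return log_sum_exp - logits[0]


def momentum_update(key_params, query_params, momentum=0.999):
    """Move key-encoder parameters slowly toward the query-encoder parameters."""
    return [momentum * k + (1.0 - momentum) * q
            for k, q in zip(key_params, query_params)]
```

In this sketch, a queue entry that actually belongs to the same speaker as the query acts as a false negative: it inflates the loss although the pair should be pulled together, which is the kind of effect the abstract says the improved alignment loss is meant to mitigate.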