ISCA Archive CHiME 2023

The NWPU-ByteAudio System for CHiME-7 Task 2 UDASE Challenge

Zihan Zhang, Runduo Han, Ziqian Wang, Xianjun Xia, Yijian Xiao, Lei Xie

This paper describes the NWPU-ByteAudio system for CHiME-7 Task 2, unsupervised domain adaptation for conversational speech enhancement (UDASE). To make better use of the in-domain mixture data, we improve the self-supervised learning (SSL) approach RemixIT with a MetricGAN discriminator, resulting in an updated version called RemixIT-G. Under the RemixIT-G framework, we take Uformer+ as the speech enhancement model; Uformer+ is an improved version of Uformer that is likewise updated with the MetricGAN discriminator. We also apply an unsupervised noise adaptation model to generate noisy speech in the target domain. A perceptual contrast stretching (PCS) method is used to further improve the perceptual quality of the enhanced speech. Our approach achieved an SI-SDR of 12.95 dB and an OVRL-MOS of 3.07 on the CHiME-7 Task 2 evaluation set and ranked first in the challenge.
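For readers unfamiliar with the SI-SDR figure quoted above, the following is a minimal sketch of the standard scale-invariant signal-to-distortion ratio, written in plain Python. It is not the challenge's official scoring script, which may differ in details such as resampling or segment handling. The function name `si_sdr`, the `eps` stabiliser, and the synthetic test signals are illustrative assumptions.

```python
import math


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB between two equal-length sample sequences.

    Illustrative sketch only; `eps` is an assumed stabiliser that avoids
    division by zero and log of zero.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    n = len(reference)

    # Remove the mean of each signal so a DC offset does not affect the score.
    mean_est = sum(estimate) / n
    mean_ref = sum(reference) / n
    est = [x - mean_est for x in estimate]
    ref = [x - mean_ref for x in reference]

    # Project the estimate onto the reference to obtain the scaled target.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in ref]

    # Everything in the estimate that is not the scaled target is distortion.
    distortion = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    distortion_energy = sum(d * d for d in distortion)

    return 10.0 * math.log10((target_energy + eps) / (distortion_energy + eps))


# Synthetic example (hypothetical signals): a 5-cycle sine as the clean
# reference, plus a weaker 50-cycle cosine acting as residual noise.
N = 1000
clean = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
residual = [0.1 * math.cos(2 * math.pi * 50 * t / N) for t in range(N)]
enhanced = [c + r for c, r in zip(clean, residual)]

score = si_sdr(enhanced, clean)
score_rescaled = si_sdr([3.0 * x for x in enhanced], clean)
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the enhanced signal leaves the score essentially unchanged, which is why the metric suits systems whose output gain is not controlled.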