ISCA Archive Interspeech 2020

Generative Adversarial Network Based Acoustic Echo Cancellation

Yi Zhang, Chengyun Deng, Shiqian Ma, Yongtao Sha, Hui Song, Xiangang Li

Generative adversarial networks (GANs) have become a popular research topic in speech enhancement tasks such as noise suppression. By training the noise suppression algorithm in an adversarial setting, GAN-based solutions often yield good performance. In this paper, a convolutional recurrent GAN architecture (CRGAN-EC) is proposed to address both linear and nonlinear echo scenarios. The proposed architecture is trained in the frequency domain and predicts a time-frequency (TF) mask for the target speech. Several metric loss functions are deployed, and their influence on echo cancellation performance is studied. Experimental results suggest that the proposed method outperforms existing methods for unseen speakers in terms of echo return loss enhancement (ERLE) and perceptual evaluation of speech quality (PESQ). Moreover, using multiple metric loss functions provides more freedom to achieve specific goals, e.g., more echo suppression or less distortion.
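To make the two technical notions in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the commonly used definition of ERLE as the ratio of microphone-signal power to residual-output power in dB, measured over far-end single-talk segments, and it assumes a real-valued TF mask in [0, 1] applied element-wise to the microphone STFT. The function names `erle_db` and `apply_tf_mask` and the synthetic signals are hypothetical and do not come from the paper.

```python
import numpy as np


def erle_db(mic, residual, eps=1e-12):
    """ERLE in dB under the assumed definition 10*log10(E[mic^2] / E[residual^2]).

    Intended for far-end single-talk segments, where the microphone signal
    contains only echo. `eps` guards against division by zero.
    """
    mic = np.asarray(mic, dtype=float)
    residual = np.asarray(residual, dtype=float)
    return 10.0 * np.log10((np.mean(mic ** 2) + eps) / (np.mean(residual ** 2) + eps))


def apply_tf_mask(mic_stft, mask):
    """Apply a real-valued TF mask element-wise to a complex STFT.

    The mask is clipped to [0, 1] (an assumption of this sketch); the phase
    of the microphone STFT is left unchanged.
    """
    return np.clip(mask, 0.0, 1.0) * mic_stft


# Synthetic single-talk example: the "canceller" attenuates the echo by a factor of 10.
rng = np.random.default_rng(0)
echo = rng.standard_normal(16000)
residual = 0.1 * echo
print(f"ERLE: {erle_db(echo, residual):.2f} dB")  # approximately 20 dB

# Synthetic STFT (frequency bins x frames) with an all-pass and an all-stop mask.
stft = rng.standard_normal((161, 50)) + 1j * rng.standard_normal((161, 50))
passed = apply_tf_mask(stft, np.ones(stft.shape))
blocked = apply_tf_mask(stft, np.zeros(stft.shape))
```

A higher ERLE indicates more echo suppression, while PESQ, which is not sketched here, reflects perceived speech quality. The two can pull in different directions, which is consistent with the trade-off between echo suppression and distortion mentioned in the abstract.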