ISCA Archive Interspeech 2023

DNN-based Parameter Estimation for MVDR Beamforming and Post-filtering

Minseung Kim, Sein Cheong, Jong Won Shin

Multi-channel speech enhancement systems usually consist of spatial filtering, such as minimum-variance distortionless-response (MVDR) beamforming, and post-processing. These stages require acoustic parameters, including the relative transfer function (RTF), the noise spatial covariance matrix (SCM), and the a priori and a posteriori signal-to-noise ratios (SNRs). In this paper, we propose deep neural network (DNN)-based parameter estimation for MVDR beamforming and post-filtering. Specifically, we propose to use a DNN to estimate the interchannel phase differences of the clean speech and the speech presence probability (SPP), which are used to estimate the RTF and the noise SCM for MVDR beamforming. For the post-processing, we adopt the iDeepMMSE framework, in which another DNN is employed to estimate the a priori SNR, the speech power spectral density, and the SPP used to compute spectral gains. The proposed method outperformed several previous approaches, especially in PESQ scores, on the CHiME-4 dataset.
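To illustrate how an RTF and a noise SCM feed into MVDR beamforming, the following is a minimal NumPy sketch and not the authors' implementation. It rests on several assumptions that the abstract does not confirm: the RTF is taken to have unit magnitude and is built directly from the interchannel phase differences, the noise SCM is a batch average weighted by (1 - SPP), and a small diagonal loading term is added for numerical stability. The function names (`rtf_from_ipd`, `noise_scm_from_spp`, `mvdr_weights`, `apply_beamformer`) are hypothetical, and in the sketch the phase differences and SPP are plain arrays standing in for DNN outputs.

```python
import numpy as np


def rtf_from_ipd(ipd):
    """Build a unit-magnitude RTF from interchannel phase differences.

    ipd: (F, M) phases in radians relative to the reference channel.
    Assumption: magnitude differences between channels are ignored.
    """
    return np.exp(1j * ipd)


def noise_scm_from_spp(Y, spp, eps=1e-8):
    """Estimate the noise SCM as a (1 - SPP)-weighted average.

    Y:   (M, F, T) complex STFT of the noisy multi-channel input.
    spp: (F, T) speech presence probability in [0, 1].
    Returns an (F, M, M) array. Assumption: batch averaging over all frames.
    """
    noise_weight = 1.0 - spp
    num = np.einsum('ft,mft,nft->fmn', noise_weight, Y, Y.conj())
    den = noise_weight.sum(axis=1)[:, None, None] + eps
    return num / den


def mvdr_weights(rtf, scm, diag_load=1e-6):
    """Compute w = SCM^{-1} d / (d^H SCM^{-1} d) for every frequency bin.

    rtf: (F, M), scm: (F, M, M). Returns (F, M) weights.
    diag_load is an assumed regularizer, not taken from the source.
    """
    n_ch = rtf.shape[1]
    scm = scm + diag_load * np.eye(n_ch)[None]
    x = np.linalg.solve(scm, rtf[..., None])[..., 0]  # SCM^{-1} d
    denom = np.einsum('fm,fm->f', rtf.conj(), x)      # d^H SCM^{-1} d
    return x / denom[:, None]


def apply_beamformer(w, Y):
    """Apply w^H y per time-frequency bin. Returns an (F, T) array."""
    return np.einsum('fm,mft->ft', w.conj(), Y)
```

By construction the weights satisfy the distortionless constraint, w^H d = 1, in every frequency bin, so a signal that matches the assumed RTF passes through unchanged while the output noise power is minimized.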