ISCA Archive Interspeech 2023

Weighted Von Mises Distribution-based Loss Function for Real-time STFT Phase Reconstruction Using DNN

Nguyen Binh Thien, Yukoh Wakabayashi, Yuting Geng, Kenta Iwai, Takanobu Nishiura

This paper presents improvements to real-time phase reconstruction using deep neural networks (DNNs). DNN-based approaches to phase reconstruction have two advantages: they can leverage prior knowledge learned from data, and they can be adapted to real-time applications by using causal models. However, conventional DNN-based methods do not account for the varying properties of the phase at different time-frequency bins. We propose loss functions for phase reconstruction that incorporate frequency-specific and amplitude weights, so that phase elements are weighted by importance according to their properties. We also use an extension of the group delay to improve the phase relationships along the frequency axis. To improve generalization, we augment the data by randomly shifting the signals in the time domain at each training epoch. Experimental results show that the proposed methods outperform conventional DNN-based and non-DNN real-time phase reconstruction methods.
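As a rough illustration of the ideas summarized above, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the von Mises-based loss reduces, up to constants, to 1 - cos(phase error) per time-frequency bin, and that the amplitude and frequency-specific weights enter as a multiplicative weight on each bin. The function names, the `freq_weights` parameter, and the use of a circular shift for the time-domain augmentation are all hypothetical choices made for this example.

```python
import numpy as np


def weighted_von_mises_loss(phase_pred, phase_true, amplitude, freq_weights=None):
    """Weighted mean of 1 - cos(phase error) over time-frequency bins.

    phase_pred, phase_true, amplitude: arrays of shape (freq, time).
    freq_weights: optional array of shape (freq,); hypothetical
    per-frequency weights (assumption, not taken from the paper).
    """
    weights = np.asarray(amplitude, dtype=float)
    if freq_weights is not None:
        # Broadcast the per-frequency weights across all time frames.
        weights = weights * np.asarray(freq_weights, dtype=float)[:, None]
    # 1 - cos(.) is 2*pi-periodic, so phase wrapping does not affect the loss.
    err = 1.0 - np.cos(np.asarray(phase_pred) - np.asarray(phase_true))
    # Small constant guards against division by zero for silent inputs.
    return float(np.sum(weights * err) / (np.sum(weights) + 1e-12))


def random_time_shift(signal, max_shift, rng):
    """Shift a 1-D signal by a random number of samples in [0, max_shift].

    Assumption: a circular shift (np.roll) stands in for the time-domain
    shift; the shift could equally be implemented by padding or cropping.
    """
    shift = int(rng.integers(0, max_shift + 1))
    return np.roll(signal, shift)
```

In this sketch, bins with near-zero amplitude contribute almost nothing to the loss, which reflects the intuition that phase errors matter less where the signal has little energy. Calling `random_time_shift` with a fresh random shift at every epoch changes the phase of each STFT frame while leaving the underlying signal content unchanged.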