ISCA Archive Interspeech 2015

Improved phase reconstruction in single-channel speech separation

Florian Mayer, Pejman Mowlaee

Conventional single-channel source separation (SCSS) algorithms mostly focus on estimating the spectral amplitudes of the underlying sources in a mixture. The importance of phase information in source separation, and its positive impact on the achievable performance, has not yet been adequately studied. In this work, we propose a phase estimation method to enhance the spectral phase of the underlying signals in the SCSS framework. The proposed method relies on multi-pitch estimation and phase decomposition, followed by temporal smoothing filters applied to the unwrapped mixture phase. We combine the proposed phase estimator with the ideal binary mask and non-negative matrix factorization, two well-known SCSS methods for separating the spectral amplitudes. Our results show that improvements in quality and intelligibility are achievable by replacing the mixture phase with the estimated one when reconstructing the sources.
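As a rough illustration of the reconstruction step described in the abstract, the sketch below pairs an ideal-binary-mask amplitude estimate with two different phases: the mixture phase and the clean source phase (an oracle used here only as an upper bound). It is not the authors' estimator. The function names, the moving-average filter, the window width, and the random complex spectra standing in for STFTs are all assumptions made for this example, and the multi-pitch estimation and phase decomposition stages are omitted.

```python
import numpy as np

def ideal_binary_mask(target_stft, interferer_stft):
    """Return 1.0 where the target magnitude exceeds the interferer's, else 0.0."""
    return (np.abs(target_stft) > np.abs(interferer_stft)).astype(float)

def smooth_phase_over_time(phase, width=5):
    """Unwrap phase along the frame axis (axis 1), then apply a moving average.

    Hypothetical stand-in for the temporal smoothing filters mentioned in the
    abstract; `width` is assumed to be odd so the output keeps the input shape.
    """
    unwrapped = np.unwrap(phase, axis=1)
    pad = width // 2
    # Edge padding avoids pulling the first and last frames toward zero.
    padded = np.pad(unwrapped, ((0, 0), (pad, pad)), mode="edge")
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, padded
    )

def reconstruct(amplitude, phase):
    """Combine an amplitude estimate with a phase estimate into a complex spectrum."""
    return amplitude * np.exp(1j * phase)

# Toy data: random complex "spectra" (frequency bins x frames), not real speech.
rng = np.random.default_rng(0)
shape = (129, 40)
s1 = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
s2 = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mixture = s1 + s2

# Amplitude estimate for source 1: binary mask applied to the mixture magnitude.
amp1 = ideal_binary_mask(s1, s2) * np.abs(mixture)

# Same amplitude, two different phases.
with_mixture_phase = reconstruct(amp1, np.angle(mixture))
with_clean_phase = reconstruct(amp1, np.angle(s1))  # oracle upper bound

err_mixture = np.linalg.norm(with_mixture_phase - s1)
err_clean = np.linalg.norm(with_clean_phase - s1)

# Shape-preserving smoothing of the unwrapped mixture phase (illustration only).
smoothed = smooth_phase_over_time(np.angle(mixture))
```

For a fixed amplitude, the spectral error in each bin is smallest when the phase equals the clean source phase, so `err_clean` never exceeds `err_mixture`. That gap is what a phase estimator tries to close. Smoothing the raw mixture phase of random data, as in the last line, is not expected to help on its own; it only shows the mechanics of unwrapping and filtering along time.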