ISCA Archive Interspeech 2023

Recursive Sound Source Separation with Deep Learning-based Beamforming for Unknown Number of Sources

Hokuto Munakata, Ryu Takeda, Kazunori Komatani

We propose a recursive separation model, built on deep learning-based beamforming, for mixtures containing an unknown number of sound sources. Recursive separation models have been investigated as a way to separate a single-channel mixture signal composed of an unknown number of sources: the mixture signal is separated recursively using residual information. Although such a model can be extended to the multi-channel condition with a beamforming-based filter, separation performance degrades because the beamforming-based filter tends to accumulate estimation errors over the recursions. To address this problem, we introduce a recursive separation model based on the local Gaussian model (LGM). The proposed method mitigates error accumulation by reusing estimated parameters and applying only one filter to the mixture signal. Experimental results show that the proposed method outperforms a separation model that uses an accumulative filter.
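
The generic recursive scheme mentioned above, in which one source is extracted per step and the residual is passed to the next step until nothing meaningful remains, can be illustrated with a minimal Python sketch. This is not the proposed method: it contains no beamforming and no LGM, and `toy_one_and_rest`, `recursive_separate`, `stop_energy`, and `max_sources` are hypothetical names introduced only for illustration. The toy separator is a stand-in for a learned model and assumes sources that occupy disjoint samples with clearly different amplitudes.

```python
from typing import Callable, List, Tuple

Signal = List[float]


def energy(x: Signal) -> float:
    """Sum of squared samples."""
    return sum(v * v for v in x)


def toy_one_and_rest(x: Signal) -> Tuple[Signal, Signal]:
    """Hypothetical stand-in for a learned separator.

    Keeps the samples close to the current peak magnitude as one "source"
    and returns everything else as the residual.
    """
    peak = max(abs(v) for v in x)
    source = [v if abs(v) > 0.75 * peak else 0.0 for v in x]
    residual = [v - s for v, s in zip(x, source)]
    return source, residual


def recursive_separate(
    mixture: Signal,
    separate_one: Callable[[Signal], Tuple[Signal, Signal]] = toy_one_and_rest,
    stop_energy: float = 1e-6,  # assumed stopping threshold
    max_sources: int = 10,      # assumed safety limit on recursions
) -> List[Signal]:
    """Extract one source per step until the residual is (nearly) silent."""
    sources: List[Signal] = []
    residual = list(mixture)
    while len(sources) < max_sources and energy(residual) > stop_energy:
        source, residual = separate_one(residual)
        sources.append(source)
    return sources


# Three toy sources on disjoint samples; the count is not given to the loop.
mixture = [3.0, 3.0, 2.0, 2.0, 1.0, 1.0]
estimated = recursive_separate(mixture)
print(len(estimated))
```

In this sketch the number of sources is never supplied; it follows from how many steps run before the residual energy drops below the threshold. Each step here operates on the output of the previous one, which is the kind of chained processing in which estimation errors can accumulate when the per-step separator is imperfect.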