ISCA Archive Interspeech 2022
ISCA Archive Interspeech 2022

On the Use of Deep Mask Estimation Module for Neural Source Separation Systems

Kai Li, Xiaolin Hu, Yi Luo

Most of the recent neural source separation systems rely on a masking-based pipeline where a set of multiplicative masks are estimated from and applied to a signal representation of the input mixture. The estimation of such masks, in almost all network architectures, is done by a single layer followed by an optional nonlinear activation function. However, recent literatures have investigated the use of a deep mask estimation module and observed performance improvement compared to a shallow mask estimation module. In this paper, we analyze the role of such deeper mask estimation module by connecting it to a recently proposed unsupervised source separation method, and empirically show that the deep mask estimation module is an efficient approximation of the so-called overseparation-grouping paradigm with the conventional shallow mask estimation layers.