ISCA Archive Interspeech 2023

Monaural Speech Separation Method Based on Recurrent Attention with Parallel Branches

Xue Yang, Changchun Bao, Xu Zhang, Xianhong Chen

In many speech separation methods, the contextual information contained in the feature sequence is mainly modeled by recurrent layers and/or the self-attention mechanism. However, how to combine these two powerful approaches more effectively remains to be explored. In this paper, a recurrent attention with parallel branches is proposed, in which the contextual information contained in the time-frequency (T-F) features is first fully exploited by attention and then further modeled by recurrent modules in the conventional manner. Specifically, the proposed recurrent attention with parallel branches uses two attention modules stacked sequentially. Each attention module has two parallel self-attention branches that model the dependencies along two axes of the T-F features, followed by a convolutional layer for feature fusion. In this way, the contextual information in the T-F features can be fully exploited and further modeled by the recurrent modules. Experimental results demonstrated the effectiveness of the proposed method.
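To make the described structure concrete, below is a minimal PyTorch-style sketch of one possible reading of the abstract: two stacked attention modules, each with two parallel self-attention branches and a convolutional fusion layer, followed by a recurrent module. All module names, the choice of time and frequency as the two attention axes, the residual connections, and every hyper-parameter are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn


class ParallelBranchAttention(nn.Module):
    """Two parallel self-attention branches over a T-F feature map,
    fused by a convolutional layer (assumed 1x1 convolution)."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.time_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.freq_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Fuse the concatenated branch outputs back to `channels` features.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, freq, time)
        b, c, f, t = x.shape
        # Branch 1: attention along the time axis (each frequency bin
        # is treated as a sequence of frames).
        xt = x.permute(0, 2, 3, 1).reshape(b * f, t, c)
        xt, _ = self.time_attn(xt, xt, xt)
        xt = xt.reshape(b, f, t, c).permute(0, 3, 1, 2)
        # Branch 2: attention along the frequency axis (each frame
        # is treated as a sequence of frequency bins).
        xf = x.permute(0, 3, 2, 1).reshape(b * t, f, c)
        xf, _ = self.freq_attn(xf, xf, xf)
        xf = xf.reshape(b, t, f, c).permute(0, 3, 2, 1)
        # Convolutional fusion of the two branches; the residual
        # connection here is an assumption.
        return x + self.fuse(torch.cat([xt, xf], dim=1))


class RecurrentAttention(nn.Module):
    """Two stacked parallel-branch attention modules followed by a
    recurrent module (assumed here to be a BLSTM over the time axis)."""

    def __init__(self, channels: int):
        super().__init__()
        self.attn = nn.Sequential(
            ParallelBranchAttention(channels),
            ParallelBranchAttention(channels),
        )
        self.rnn = nn.LSTM(channels, channels // 2,
                           batch_first=True, bidirectional=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.attn(x)  # (batch, channels, freq, time)
        b, c, f, t = x.shape
        seq = x.permute(0, 2, 3, 1).reshape(b * f, t, c)
        seq, _ = self.rnn(seq)
        return seq.reshape(b, f, t, c).permute(0, 3, 1, 2)


# Usage: a batch of 2 T-F feature maps (64 channels, 128 bins, 100 frames).
if __name__ == "__main__":
    block = RecurrentAttention(channels=64)
    out = block(torch.randn(2, 64, 128, 100))
    print(out.shape)  # torch.Size([2, 64, 128, 100])

The shape bookkeeping is the main design point of such dual-axis attention: the same feature map is reshaped twice so that standard sequence attention can run along either axis, and the 1x1 convolution lets the network learn how to weight the two views.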