ISCA Archive Interspeech 2025

Deep-Simplex Multichannel Speech Separation

Tzlil Avidan, Bracha Laufer-Goldshtein

Numerous methods exist for sound source separation, leveraging either classical signal processing or deep learning approaches. While deep-learning-based models often outperform conventional methods, they require large training datasets and struggle to generalize to new settings. To address this, we propose Deep-Simplex, a deep prior-based method that reconstructs the probability simplex of speaker activity over time. This global activity probability guides the estimation of a local mask per frequency, identifying the dominant speaker in each time-frequency (TF) bin. We then use this mask for both spatial and spectral separation. Experimental results demonstrate that Deep-Simplex outperforms competing baselines under different reverberation conditions.
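To make the described pipeline concrete, below is a minimal NumPy sketch, not the authors' implementation, of how a global per-frame activity distribution on the probability simplex might gate a per-TF-bin dominant-speaker mask. All shapes, the softmax simplex projection, the random stand-in for local spectral evidence, and the argmax masking rule are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Hypothetical sketch (NOT the paper's method): combine a global,
# per-frame speaker-activity distribution with a local per-bin score
# to pick one dominant speaker per time-frequency (TF) bin.
rng = np.random.default_rng(0)
T, F, S = 100, 257, 2  # frames, frequency bins, speakers (assumed sizes)

# Project per-frame activity logits onto the probability simplex (softmax).
logits = rng.normal(size=(T, S))
e = np.exp(logits - logits.max(axis=1, keepdims=True))
activity = e / e.sum(axis=1, keepdims=True)      # (T, S): rows sum to 1

# Stand-in for local spectral evidence per TF bin (assumed quantity).
local_score = rng.random(size=(T, F, S))

# Weight local evidence by global activity, then take the dominant
# speaker in each TF bin and form a one-hot binary mask.
weighted = activity[:, None, :] * local_score    # broadcast over frequency
dominant = weighted.argmax(axis=-1)              # (T, F)
mask = np.eye(S)[dominant]                       # (T, F, S), one-hot per bin
```

Such a binary mask could then be applied to the multichannel spectrogram for spectral masking, or used to select TF bins for estimating spatial filters; the paper's actual estimation procedure is deep-prior-based and differs from this toy construction.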