ISCA Archive ICSLP 2002

Comparative evaluation of CASA and BSS models for subband cocktail-party speech separation

Frédéric Berthommier, Seungjin Choi

For speech segregation, a recurrent blind source separation (BSS) model is tested alongside a Computational Auditory Scene Analysis (CASA) model, which is based on the localisation cue and the evaluation of the Time Delay Of Arrival (TDOA). The test database is composed of 332 binary mixture sentences recorded in stereo with a static set-up. These are truncated to 1 second for the simulations. To apply the two models, we divide the frequency domain into a variable number of subbands, which are processed independently. We then evaluate the gain, using reference signals recorded in isolation. After a careful analysis, we find similar gains of about 2-3 dB for both methods. Varying the number of subbands allows an optimisation: we obtain a significant peak at 4 subbands for the CASA model and a maximum at 2 subbands for the BSS model.
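As a rough illustration of the localisation cue mentioned above, the sketch below estimates a TDOA between two channels from the peak of their cross-correlation. It is not the authors' CASA model: the function name `estimate_tdoa`, the 16 kHz sampling rate, the lag range, and the synthetic white-noise test signal are all assumptions made for this example.

```python
import numpy as np

def estimate_tdoa(left, right, fs, max_lag):
    """Return the delay in seconds of `right` relative to `left`.

    The delay is taken as the lag (within +/- max_lag samples) that
    maximises the cross-correlation of the two channels.
    """
    n = len(left)
    lags = np.arange(-max_lag, max_lag + 1)
    xcorr = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag >= 0:
            # Compare left[k] with right[k + lag]
            xcorr[i] = np.dot(left[:n - lag], right[lag:])
        else:
            # Compare left[k - lag] with right[k]
            xcorr[i] = np.dot(left[-lag:], right[:n + lag])
    return lags[np.argmax(xcorr)] / fs

# Synthetic check (assumed values): 1 second of white noise at 16 kHz,
# with the right channel delayed by 5 samples.
fs = 16000
rng = np.random.default_rng(0)
source = rng.standard_normal(fs)
delay = 5
left = source
right = np.concatenate([np.zeros(delay), source[:-delay]])

tdoa = estimate_tdoa(left, right, fs, max_lag=20)
```

In a subband setting such as the one described, the same estimate could in principle be computed separately on each band-filtered pair of channels; how the bands are defined and combined is left open here because the abstract does not specify it.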