ISCA Archive CHiME 2024

System Description of NJU-AALab's Submission for the CHiME-8 NOTSOFAR-1 Challenge

Qinwen Hu, Tianchi Sun, Xinan Chen, Xiaobin Rong, Jing Lu

This paper describes the NJU-AALab team’s entry to the Natural Office Talkers in Settings of Far-field Audio Recordings (NOTSOFAR-1) task of the CHiME-8 Challenge. The system is a pipeline of three modules: a continuous speech separation (CSS) module based on TF-GridNet, a state-of-the-art speech separation model; a speech recognition module using Whisper “large-v2”; and a speaker diarization module based on the multilevel normalized maximum eigengap-based spectral clustering (NME-SC) method. The proposed system achieves a time-constrained minimum permutation word error rate (tcpWER) of 33.5% on the evaluation set and 36.4% on the development set of the NOTSOFAR-1 real recordings, outperforming the baseline by a large margin and ranking 3rd in the single track of the challenge.
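
The eigengap idea underlying NME-SC can be illustrated with a minimal sketch. It is not the authors' implementation and omits both the "multilevel" and the "normalized maximum" parts of the method: it only estimates the number of speakers as the position of the largest gap between consecutive eigenvalues of the graph Laplacian built from a speaker-embedding affinity matrix. The function names `eigengap_speaker_count` and `toy_affinity`, the `max_speakers` parameter, and the toy affinity values are all assumptions made for illustration.

```python
import numpy as np


def eigengap_speaker_count(affinity, max_speakers=4):
    """Estimate the speaker count from the largest Laplacian eigengap.

    Simplified illustration only: the affinity matrix is used as-is, with no
    neighbor pruning or normalized-eigengap search over pruning thresholds.
    """
    affinity = np.asarray(affinity, dtype=float)
    # Unnormalized graph Laplacian L = D - A.
    laplacian = np.diag(affinity.sum(axis=1)) - affinity
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(laplacian)
    # Gap k sits between the k-th and (k+1)-th smallest eigenvalues.
    gaps = np.diff(eigenvalues[: max_speakers + 1])
    return int(np.argmax(gaps)) + 1


def toy_affinity(sizes, within=0.9, across=0.1):
    """Build a hypothetical block affinity: high within a speaker, low across."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    same_speaker = labels[:, None] == labels[None, :]
    affinity = np.where(same_speaker, within, across)
    np.fill_diagonal(affinity, 1.0)
    return affinity


if __name__ == "__main__":
    print(eigengap_speaker_count(toy_affinity([3, 3])))
    print(eigengap_speaker_count(toy_affinity([3, 3, 3])))
```

On such idealized block-structured affinities the largest gap falls right after the eigenvalues associated with the speaker clusters; real embeddings are noisier, which is the situation the pruning and normalization steps of NME-SC are meant to handle.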