This paper describes the NJU-AALab team’s entry to the Natural Office Talkers in Settings of Far-field Audio Recordings (NOTSOFAR-1) task of the CHiME-8 Challenge. Our system is a pipeline of three modules: a continuous speech separation (CSS) module based on TF-GridNet, a state-of-the-art speech separation model; a speech recognition module using Whisper “large-v2”; and a speaker diarization module based on the multilevel normalized maximum eigengap-based spectral clustering (NME-SC) method. On the NOTSOFAR-1 real recordings, the system achieves a time-constrained minimum-permutation word error rate (tcpWER) of 33.5% on the evaluation set and 36.4% on the development set, outperforming the baseline by a large margin and ranking 3rd in the single track of the challenge.
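To make the data flow between the three modules concrete, the following is a minimal sketch only, not the submitted implementation. All names (`Word`, `AttributedWord`, `run_pipeline`, and the three stage callables) are hypothetical, and the stage order (separation, then recognition, then diarization over the recognized words) is an assumption taken from the order in which the modules are listed above. The real stages would wrap TF-GridNet, Whisper, and NME-SC; here they are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration; none of these names come from the paper.
Audio = Sequence[float]


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds
    stream: int   # index of the separated stream the word came from


@dataclass
class AttributedWord:
    word: Word
    speaker: str


def run_pipeline(
    mixture: Audio,
    separate: Callable[[Audio], List[Audio]],
    transcribe: Callable[[Audio], List[Tuple[str, float, float]]],
    diarize: Callable[[List[Audio], List[Word]], List[str]],
) -> List[AttributedWord]:
    """Chain the three stages: CSS -> ASR -> speaker attribution (assumed order)."""
    # 1) Separation: split the overlapped mixture into a few cleaner streams.
    streams = separate(mixture)

    # 2) Recognition: transcribe each stream into (text, start, end) tuples.
    words: List[Word] = []
    for idx, stream in enumerate(streams):
        for text, start, end in transcribe(stream):
            words.append(Word(text, start, end, idx))
    words.sort(key=lambda w: w.start)  # stable sort keeps stream order on ties

    # 3) Diarization: assign one speaker label per recognized word.
    speakers = diarize(streams, words)
    if len(speakers) != len(words):
        raise ValueError("diarize must return exactly one label per word")
    return [AttributedWord(w, s) for w, s in zip(words, speakers)]
```

Keeping each stage behind a callable means any one module can be swapped without touching the others. The time-stamped, speaker-attributed words it returns are the kind of output that a metric such as tcpWER scores.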