This technical report describes the NPU-TEA submissions to the CHiME-8 MMCSG Challenge. Our submitted systems fall into two categories, streaming and non-streaming, based on latency thresholds. All submissions use audio information only. The streaming system adopts the same framework as the first baseline system and covers the 150 ms, 350 ms, and 1000 ms latency thresholds. For the non-streaming category (latency greater than 1000 ms), we submit three different systems. Experimental results indicate that our best streaming system achieves an improvement of around 4% over the baseline system reported on the challenge website.