Speech separation aims to decompose mixed speech into independent source signals. Prior research on monaural time-domain speech separation has made great progress in a supervised manner. Almost all of these works are trained on simulated mixtures, since obtaining ground truth for real-world mixed signals is problematic. To address this, we propose a novel semi-supervised learning method for speech separation (SSLM-SS), which leverages mixed speech without ground truth. In particular, for this type of data, we further put forward a non-intrusive separated speech quality prediction network (SSQP-Net) based on self-supervised learning. According to the results, the linear correlation coefficient between the predictions of SSQP-Net and the ground truth reaches 0.9. Moreover, SSLM-SS equipped with SSQP-Net outperforms mixture invariant training (MixIT) by 0.2 dB and 1.1 dB when 10% and 50% of the data are labeled, respectively, and rivals supervised learning.