Non-intrusive speech quality assessment is a crucial task in speech processing. In recent years, methods based on deep neural networks have achieved state-of-the-art performance for non-intrusive speech quality assessment. However, the scarcity of annotated data remains the main challenge for training robust speech quality assessment networks. In this paper, we propose an impairment representation learning approach that pre-trains the network on a large amount of simulated data without MOS annotations. We then fine-tune the pre-trained model for the MOS prediction task on annotated data. Experimental results show that the proposed pre-training method significantly improves speech quality assessment performance, especially when the annotated training data is limited. Moreover, the proposed method significantly outperforms the baseline system of the ConferencingSpeech 2022 Challenge.