This paper presents the details of the BIT-MI deep learning-based model submitted to the ConferencingSpeech 2022 challenge. Because subjective listening tests are costly in both time and labor, the challenge aims to promote research on non-intrusive objective quality assessment for speech communication, targeting effective evaluation of speech quality in online conferencing applications. We propose a novel deep learning-based model comprising a new convolutional neural network (CNN) architecture, a bidirectional long short-term memory (BLSTM) network, an average pooling layer, and a range clipping method. In addition, we construct a two-part objective function that combines the mean square error (MSE) and the Pearson correlation coefficient (PCC) between predictions and labels, so that the assessment model is jointly optimized for both prediction error and correlation. Experimental results show that the proposed model significantly outperforms the official baseline system on both the validation and test sets.
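To make the two-part objective and the range clipping concrete, the following is a minimal sketch in plain Python rather than the submitted implementation. Several details are assumptions not stated above: the two terms are combined as a weighted sum, the correlation term is written as 1 - PCC so that minimizing it raises the correlation, the weight `alpha` is hypothetical, and predictions are clipped to a 1 to 5 score range. All function names are illustrative.

```python
import math

def mse(preds, labels):
    """Mean square error between predictions and labels."""
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(preds)

def pcc(preds, labels, eps=1e-8):
    """Pearson correlation coefficient; eps guards against division by zero."""
    n = len(preds)
    mean_p = sum(preds) / n
    mean_y = sum(labels) / n
    cov = sum((p - mean_p) * (y - mean_y) for p, y in zip(preds, labels))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in preds))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in labels))
    return cov / (std_p * std_y + eps)

def joint_loss(preds, labels, alpha=1.0):
    """Assumed combination: MSE + alpha * (1 - PCC). alpha is hypothetical."""
    return mse(preds, labels) + alpha * (1.0 - pcc(preds, labels))

def clip_range(preds, low=1.0, high=5.0):
    """Clip predictions to an assumed valid score range [low, high]."""
    return [min(max(p, low), high) for p in preds]
```

Under these assumptions, the loss is close to zero only when predictions both match the labels and correlate with them; a model whose outputs are accurate on average but poorly ranked is still penalized through the correlation term. In a training setup the same expression would be written with differentiable tensor operations, and clipping would typically be applied only at inference time.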