Over the past decade, deep learning has demonstrated its effectiveness and keeps setting new records in a wide variety of tasks. However, strong model performance usually comes at the cost of a huge number of parameters and extremely high computational complexity, which greatly limits the applicability of deep learning models, particularly in embedded systems. Therefore, model compression is attracting more and more attention. In this paper, we propose a compression strategy based on iterative pruning and knowledge distillation. Specifically, in each iteration, we first apply a pruning criterion to drop the weights that have the least impact on performance. Then, the model before pruning is used as a teacher to fine-tune the student, i.e., the pruned model. After several iterations, we obtain the final compressed model. The proposed method is verified on a gated convolutional recurrent network (GCRN) and a long short-term memory (LSTM) network for the single-channel speech enhancement task. Experimental results show that the proposed compression strategy can reduce the model size of GCRN by a factor of 40 without significant performance degradation.
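The iterative prune-then-distill loop described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the magnitude-based criterion, the MSE distillation loss, the 0.5 loss weight, and all function names are assumptions made for the sake of the example.

```python
import copy
import torch
import torch.nn.functional as F

def magnitude_prune(model, sparsity):
    """Zero out the smallest-magnitude weights (a simple stand-in pruning criterion)."""
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:                      # skip biases / norm parameters
            continue
        k = int(param.numel() * sparsity)
        if k == 0:
            continue
        threshold = param.abs().flatten().kthvalue(k).values
        masks[name] = (param.abs() > threshold).float()
        param.data.mul_(masks[name])             # drop the least important weights
    return masks

def distill_finetune(student, teacher, loader, masks, epochs=1, lr=1e-4):
    """Fine-tune the pruned student to match the unpruned teacher's outputs."""
    teacher.eval()
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for noisy, clean in loader:
            with torch.no_grad():
                soft_target = teacher(noisy)     # teacher = model before pruning
            pred = student(noisy)
            # Combine the ground-truth loss with a distillation term (weighting assumed).
            loss = F.mse_loss(pred, clean) + 0.5 * F.mse_loss(pred, soft_target)
            opt.zero_grad()
            loss.backward()
            opt.step()
            # Keep the pruned weights at zero after each update.
            with torch.no_grad():
                for name, param in student.named_parameters():
                    if name in masks:
                        param.mul_(masks[name])
    return student

def iterative_compress(model, loader, sparsity_per_iter=0.3, iterations=5):
    """Repeat pruning and distillation; each round distills from the pre-pruning model."""
    for _ in range(iterations):
        teacher = copy.deepcopy(model)           # model before pruning acts as teacher
        masks = magnitude_prune(model, sparsity_per_iter)
        model = distill_finetune(model, teacher, loader, masks)
    return model
```

Repeating the cycle over several iterations gradually increases sparsity while the distillation step recovers the performance lost at each pruning round.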