In clinical dictation, transcripts produced by automatic speech recognition (ASR) lack explicit punctuation marks, which can lead to misinterpretation of dictated reports. Automatic punctuation restoration (APR) is therefore required to turn ASR output into precise and readable clinical reports. Targeting practical deployment, we propose a fast and lightweight pre-trained model for Chinese medical punctuation restoration based on the ‘pre-training and fine-tuning’ paradigm. In this work, we distill pre-trained models by incorporating supervised contrastive learning and a novel auxiliary pre-training task (Punctuation Mark Prediction) that makes the distilled model well suited for punctuation restoration. In the fine-tuning stage, we then reformulate APR as a slot tagging problem to bridge the gap between pre-training and fine-tuning. Our experiments on various distilled models show that our model achieves 95% of the performance of the state-of-the-art Chinese RoBERTa while using only 10% of its model size.
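
To make the slot-tagging reformulation concrete, the sketch below (ours for illustration, not the paper's code; the punctuation inventory and tag names are assumptions) shows how a punctuated Chinese sentence can be converted into per-character training labels, with each character tagged by the punctuation mark that follows it:

```python
# Minimal sketch of punctuation restoration as slot tagging:
# each input character receives a tag naming the punctuation mark
# that follows it, with "O" meaning no punctuation.
# The punctuation inventory below is an illustrative assumption.

PUNCT_TAGS = {"，": "COMMA", "。": "PERIOD", "？": "QUESTION", "：": "COLON"}

def to_slot_tags(punctuated_text: str) -> list[tuple[str, str]]:
    """Convert punctuated text into (character, tag) training pairs."""
    pairs: list[tuple[str, str]] = []
    for ch in punctuated_text:
        if ch in PUNCT_TAGS and pairs:
            # Attach the mark as the tag of the preceding character.
            pairs[-1] = (pairs[-1][0], PUNCT_TAGS[ch])
        else:
            pairs.append((ch, "O"))
    return pairs

print(to_slot_tags("患者无发热，无咳嗽。"))
# [('患', 'O'), ('者', 'O'), ('无', 'O'), ('发', 'O'), ('热', 'COMMA'),
#  ('无', 'O'), ('咳', 'O'), ('嗽', 'PERIOD')]
```

Under this formulation, restoration at inference time reduces to standard token classification over unpunctuated ASR output, so the same encoder and tagging head used during pre-training can be reused directly in fine-tuning.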