Automatic Speech Recognition (ASR) for children's speech faces unique challenges, chiefly the scarcity of available training data, which makes training large ASR models from scratch difficult. Fine-tuning a pre-trained model is therefore a common strategy; however, fine-tuning an entire large pre-trained model on limited children's speech data can overfit and degrade performance. This study offers a granular evaluation of fine-tuning for children's ASR, departing from conventional whole-network tuning. We present a partial fine-tuning approach that highlights the importance of the Encoder and Feedforward Neural Network modules in Transformer-based models. This method surpasses whole-model fine-tuning, yielding a relative word error rate improvement of 9\% under limited data. Our findings underscore the critical role of partial fine-tuning in advancing children's ASR model development.
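As a concrete illustration of the general idea of partial fine-tuning (a minimal sketch, not the authors' code), the snippet below freezes a pre-trained Transformer encoder and re-enables gradients only for its feed-forward sub-layers. The PyTorch \texttt{TransformerEncoder} and all hyper-parameters are placeholders standing in for the actual pre-trained ASR model.
\begin{verbatim}
import torch
import torch.nn as nn

# Stand-in encoder: 12 Transformer layers, used here as a placeholder
# for a pre-trained ASR encoder loaded from a checkpoint.
layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, dim_feedforward=2048)
encoder = nn.TransformerEncoder(layer, num_layers=12)

# 1) Freeze every parameter of the pre-trained model.
for param in encoder.parameters():
    param.requires_grad = False

# 2) Unfreeze only the feed-forward sub-layers ("linear1"/"linear2" in
#    torch.nn.TransformerEncoderLayer); all other modules stay frozen.
for name, param in encoder.named_parameters():
    if "linear1" in name or "linear2" in name:
        param.requires_grad = True

# Only the trainable (feed-forward) parameters are given to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in encoder.parameters() if p.requires_grad), lr=1e-5
)

trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
total = sum(p.numel() for p in encoder.parameters())
print(f"fine-tuning {trainable / total:.1%} of encoder parameters")
\end{verbatim}
Restricting the optimizer to this subset both reduces the number of updated parameters and leaves the remaining pre-trained weights intact, which is the mechanism by which partial fine-tuning mitigates overfitting on small children's speech corpora.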