Zero-shot performance of state-of-the-art automatic speech recognition (ASR) declines significantly for pediatric patients with speech sound disorders (SSDs) due to deviations in phonetic pronunciation. To address this, we train a subject-agnostic ASR system on 77 minutes of transcribed pediatric SSD speech, which improves on zero-shot ASR by 67.48%. Given the scarcity of data and the privacy concerns surrounding children's data, we study the suitability of voice conversion (VC) and text-to-speech (TTS) for synthesizing disorder-reflective samples. Our ASR system surpasses zero-shot performance by 71.72% when leveraging TTS and shows potential for privacy preservation when using VC. Notably, pre-training on synthetic samples alone reduces the required real SSD data to 50 minutes (i.e., 65% of the data) while achieving metrics comparable to those of the model fine-tuned on all SSD samples. This study enables ASR technologies to assist individuals with SSDs and facilitates automatic transcription of speech therapy sessions.