ISCA Archive Interspeech 2025

Automatic detection of speech sound disorders in German-speaking children: augmenting the data with typically developed speech

Darline Monika Marx, Marco Matassoni, Alessio Brutti

Speech Sound Disorders (SSD) are common among children and affect their academic, social, and emotional development. Traditional diagnosis relies on assessment by speech-language pathologists, which makes it resource-intensive. Given the global shortage of experts and the increasing demand, exploring deep-learning tools is crucial. This study adapts a multi-task framework to fine-tune a pre-trained multilingual Wav2Vec model, tackling Automatic Speech Recognition and SSD classification for German-speaking children on a custom dataset. We show that incorporating public out-of-domain datasets improves robustness and generalizability. Interestingly, combining pathological and typical speech data with mispronunciations improves both speech recognition and SSD detection. Finally, we investigate a two-step training of the model that further improves overall performance.