ISCA Archive Interspeech 2023

A Study on Using Duration and Formant Features in Automatic Detection of Speech Sound Disorder in Children

Si-Ioi Ng, Cymie Wing-Yee Ng, Tan Lee

Speech sound disorder (SSD) in children manifests as persistent articulation and phonological errors on specific phonemes of a language. Automatic SSD detection can be done using features extracted from deep neural network models, but the interpretability of such learned features is a major concern. Motivated by clinical knowledge, this research investigates the use of duration and formant features for SSD detection. Acoustic analysis is performed to identify the features that differentiate the speech of typically developing children from that of children with SSD. On the task of SSD detection in Cantonese-speaking children, the duration features are found to outperform the formant features and to surpass previous methods that use a paralinguistic feature set and speaker embeddings. Specifically, the duration features achieve a mean unweighted average recall of 71.0%. The results enhance the understanding of SSD and motivate further use of temporal information in child speech for SSD detection.
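For readers unfamiliar with the metric, unweighted average recall (UAR) is conventionally the plain mean of the per-class recalls, so each class counts equally regardless of how many samples it has. The following is a minimal sketch of that definition, not the authors' evaluation code; the function name, the class labels, and the toy predictions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # number of samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy data: 8 typical speakers, 2 disordered speakers.
y_true = ["typical"] * 8 + ["disordered"] * 2
y_pred = ["typical"] * 6 + ["disordered"] * 2 + ["disordered", "typical"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the recall is 0.75 for the typical class and 0.50 for the disordered class, giving a UAR of 0.625, while plain accuracy is 0.70. The gap shows why UAR is often preferred when one class is much smaller than the other.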