ISCA Archive Interspeech 2025

On Enhancing the Performance of Children's ASR Task in Limited Data Scenario

Ankita Ankita, Shambhavi Shambhavi, Syed Shahnawazuddin

In this paper, we document our efforts towards developing a robust children's automatic speech recognition (ASR) system in a limited-data scenario. First, we explore the effect of in-domain data augmentation to deal with the limitations posed by data scarcity. This also helps in developing a competitive baseline ASR system. Next, we study the effect of modeling glottal activity parameters along with spectrum-based front-end acoustic features such as the Mel-frequency cepstral coefficients (MFCC). Finally, we explore the impact of feature normalization through feature-space maximum likelihood linear regression (fMLLR). Applying fMLLR and then concatenating the normalized MFCC features with the glottal activity parameters yields a relative reduction in character error rate of 40% over the baseline.
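The final step described above (normalize the MFCC features, then append the glottal activity parameters frame by frame) can be illustrated with a minimal sketch. It assumes that an fMLLR transform takes the usual affine form x' = Ax + b applied to each frame. The function names, feature dimensions, transform values, and error rates below are hypothetical and chosen only for illustration; they are not taken from the paper, and the transform is not estimated here.

```python
from typing import List

Frame = List[float]


def apply_affine(frames: List[Frame], A: List[List[float]], b: List[float]) -> List[Frame]:
    """Apply x' = A x + b to every frame (the form an fMLLR transform takes)."""
    return [
        [sum(a * x for a, x in zip(row, frame)) + bias for row, bias in zip(A, b)]
        for frame in frames
    ]


def concat_features(mfcc: List[Frame], glottal: List[Frame]) -> List[Frame]:
    """Concatenate two frame-aligned feature streams along the feature axis."""
    if len(mfcc) != len(glottal):
        raise ValueError("feature streams must have the same number of frames")
    return [m + g for m, g in zip(mfcc, glottal)]


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction in an error rate, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Toy data: 2 frames of 2-dimensional "MFCC" and 1-dimensional "glottal" features.
mfcc = [[1.0, 2.0], [3.0, 4.0]]
glottal = [[0.1], [0.2]]

# Hypothetical transform; in practice A and b are estimated from data.
A = [[1.0, 0.0], [0.0, 0.5]]
b = [0.5, -1.0]

normalized = apply_affine(mfcc, A, b)
fused = concat_features(normalized, glottal)
print(fused)  # [[1.5, 0.0, 0.1], [3.5, 1.0, 0.2]]

# Illustrative error rates only: 10.0 -> 6.0 corresponds to a 40% relative reduction.
print(relative_reduction(10.0, 6.0))  # 40.0
```

Note that a relative reduction is measured against the baseline error rate, so the same 40% figure corresponds to different absolute improvements depending on where the baseline starts.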