ISCA Archive Interspeech 2015

Large vocabulary children's speech recognition with DNN-HMM and SGMM acoustic modeling

Diego Giuliani, Bagher BabaAli

In this paper, large vocabulary children's speech recognition is investigated using the Deep Neural Network - Hidden Markov Model (DNN-HMM) hybrid and the Subspace Gaussian Mixture Model (SGMM) acoustic modeling approaches. In the investigated scenario, training data is limited to about 7 hours of speech from children in the age range 7-13, and testing data consists of clean read speech from children in the same age range. To tackle inter-speaker acoustic variability, speaker adaptive training based on feature-space maximum likelihood linear regression, as well as vocal tract length normalization, is adopted. Experimental results show that very good recognition results can be achieved with both the DNN-HMM and SGMM systems, although the best results are obtained with the DNN-HMM system.


doi: 10.21437/Interspeech.2015-378

Cite as: Giuliani, D., BabaAli, B. (2015) Large vocabulary children's speech recognition with DNN-HMM and SGMM acoustic modeling. Proc. Interspeech 2015, 1635-1639, doi: 10.21437/Interspeech.2015-378

@inproceedings{giuliani15_interspeech,
  author={Diego Giuliani and Bagher BabaAli},
  title={{Large vocabulary children's speech recognition with DNN-HMM and SGMM acoustic modeling}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={1635--1639},
  doi={10.21437/Interspeech.2015-378},
  issn={2958-1796}
}