ISCA Archive Interspeech 2014

Improving native accent identification using deep neural networks

Mingming Chen, Zhanlei Yang, Hao Zheng, Wenju Liu

In this paper, we use deep neural networks (DNNs) to automatically identify native accents in English and Mandarin when no text, speaker, or gender information is available for the speech data. Compared with conventional methods based on Gaussian mixture models (GMMs), the proposed method has two main advantages: first, DNNs are discriminative models, which can better separate the regions where different accents are easily confused; second, their hierarchical nonlinear feature extraction can learn discriminative high-level features for the task at hand. Specifically, the speech data of all accents is used to train the DNNs; in the testing stage, we first identify the accent label of each frame and then determine the sentence label by majority voting over the frame labels. Experiments on accented English and Mandarin corpora demonstrate that, compared with the GMM-based methods, the proposed method significantly improves both frame accuracy and sentence accuracy on the test set. Moreover, its performance can be further improved by using context information.
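
The sentence-level decision described in the abstract (majority voting over frame labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sentence_label`, the accent labels, and the tie-breaking behaviour are assumptions made for illustration, and the frame labels stand in for the per-frame outputs of a trained DNN.

```python
from collections import Counter
from typing import Sequence


def sentence_label(frame_labels: Sequence[str]) -> str:
    """Return the most frequent frame-level accent label for one utterance."""
    if not frame_labels:
        raise ValueError("need at least one frame label")
    counts = Counter(frame_labels)
    # Assumption: ties are broken in favour of the label seen first,
    # which is how Counter.most_common orders equal counts.
    return counts.most_common(1)[0][0]


# Hypothetical per-frame predictions for a single utterance.
frames = ["US", "UK", "US", "US", "UK"]
print(sentence_label(frames))
```

Voting over hard frame labels is the simplest reading of the abstract; how ties are resolved is not stated there, so the rule used here is only a placeholder.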

doi: 10.21437/Interspeech.2014-486

Cite as: Chen, M., Yang, Z., Zheng, H., Liu, W. (2014) Improving native accent identification using deep neural networks. Proc. Interspeech 2014, 2170-2174, doi: 10.21437/Interspeech.2014-486

  author={Mingming Chen and Zhanlei Yang and Hao Zheng and Wenju Liu},
  title={{Improving native accent identification using deep neural networks}},
  booktitle={Proc. Interspeech 2014},