ISCA Archive Interspeech 2014

Acoustic features for robust classification of Mandarin tones

Hongbing Hu, Stephen A. Zahorian, Peter Guzewich, Jiang Wu

For applications such as tone modeling and automatic tone recognition, smoothed, all-voiced F0 (pitch) tracks are desirable. Three pitch trackers that have been shown to track pitch accurately are YAAPT, YIN, and PRAAT. On tests with English and Japanese databases, for which ground-truth pitch tracks are available by other means, we show that YAAPT has lower errors than YIN and PRAAT. We also experimentally compare the effectiveness of the three trackers for automatic classification of Mandarin tones. In addition to F0 tracks, a compact set of low-frequency spectral shape trajectories is used as features for automatic tone classification. A combination of pitch trajectories computed with YAAPT and spectral shape trajectories extracted from 800 ms intervals for each tone results in a tone classification accuracy of nearly 77%, a rate higher than human listeners achieve for isolated tonal syllables and higher than that obtained with the other two trackers.

doi: 10.21437/Interspeech.2014-334

Cite as: Hu, H., Zahorian, S.A., Guzewich, P., Wu, J. (2014) Acoustic features for robust classification of Mandarin tones. Proc. Interspeech 2014, 1352-1356, doi: 10.21437/Interspeech.2014-334

  author={Hongbing Hu and Stephen A. Zahorian and Peter Guzewich and Jiang Wu},
  title={{Acoustic features for robust classification of Mandarin tones}},
  booktitle={Proc. Interspeech 2014},