ISCA Archive Interspeech 2014

Speech emotion recognition with cross-lingual databases

Bo-Chang Chiou, Chia-Ping Chen

In this paper, we investigate cross-lingual automatic speech emotion recognition. The basic idea is that, since the emotion recognition system is based on acoustic features only, data in different languages can be combined to improve recognition accuracy. We begin with the construction of a Mandarin database of emotional speech, which is similar to the well-known Berlin Database of Emotional Speech (EMO-DB) in composition and size. To reduce the variability due to different languages and different speakers, we propose to apply histogram equalization as a data normalization method. Recognition systems based on support vector machines have been evaluated on EMO-DB. Compared to the baseline system without multi-lingual databases and data normalization, the proposed system raises the emotion recognition accuracy from 86.2% to 91.7%, which corresponds to a relative error reduction of 39.9% (the error rate falls from 13.8% to 8.3%). The accuracy is among the best known results reported on EMO-DB, if not the best.
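As an illustration of the normalization step named in the abstract, the following is a minimal sketch of histogram equalization applied to acoustic feature vectors. It is not the authors' implementation. The function name `histogram_equalize`, the choice of a standard normal reference distribution, and the rank-based empirical CDF are all assumptions made for this example; the abstract does not specify them.

```python
import numpy as np
from statistics import NormalDist


def histogram_equalize(features):
    """Map each feature column onto an approximately standard normal shape.

    Assumption: the reference distribution is N(0, 1). The abstract does not
    state which reference distribution the authors use.
    """
    features = np.asarray(features, dtype=float)
    n_samples = features.shape[0]

    # Rank of every value within its own column (0 = smallest).
    ranks = features.argsort(axis=0).argsort(axis=0)

    # Empirical CDF value of each sample, kept strictly inside (0, 1)
    # so that the inverse normal CDF stays finite.
    ecdf = (ranks + 0.5) / n_samples

    # Push the empirical CDF through the inverse CDF of the reference.
    inverse_normal_cdf = np.vectorize(NormalDist().inv_cdf)
    return inverse_normal_cdf(ecdf)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical feature sets with very different scales and shapes,
    # standing in for utterance-level features from two databases.
    set_a = rng.normal(loc=5.0, scale=3.0, size=(200, 4))
    set_b = rng.exponential(scale=2.0, size=(150, 4))

    # Equalize each group separately, then pool the normalized data.
    pooled = np.vstack([histogram_equalize(set_a), histogram_equalize(set_b)])
    print(pooled.shape)
```

In a cross-lingual setup one might apply such a function separately to each database or each speaker before pooling the data for classifier training, so that all groups share the same per-dimension distribution. Which grouping the paper uses is not stated in the abstract.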

doi: 10.21437/Interspeech.2014-136

Cite as: Chiou, B.-C., Chen, C.-P. (2014) Speech emotion recognition with cross-lingual databases. Proc. Interspeech 2014, 558-561, doi: 10.21437/Interspeech.2014-136

  author={Bo-Chang Chiou and Chia-Ping Chen},
  title={{Speech emotion recognition with cross-lingual databases}},
  booktitle={Proc. Interspeech 2014},