While most traditional research in emotional speech recognition has relied on a single database for assessment, the lack of large databases has made it difficult to generalize results toward building a robust emotion classification system. Recently, work has been reported on cross-training across emotional databases to examine the consistency and reliability of acoustic measures in performing emotional assessment. This paper presents preliminary results on the use of glottal-based features in cross-testing (i.e., training on one database and testing on another) across three databases for recognition of the emotions neutral, angry, happy, and sad. A comparative study using pitch-based features is also presented. The results suggest that the glottal features are more robust for the four-class emotion classification system developed in this study and perform well above chance in several of the cross-testing experiments.
Index Terms: emotion recognition, cross-database evaluation, glottal features, pitch