ISCA Archive ICSLP 2002

Combining acoustic and language information for emotion recognition

Chul Min Lee, Shrikanth S. Narayanan, Roberto Pieraccini

This paper reports on emotion recognition using both acoustic and language information in spoken utterances. Most previous efforts have focused on emotion recognition using acoustic correlates, although it is well known that language information also conveys emotions. To capture emotional information at the language level, we introduce the information-theoretic notion of ‘emotional salience’. For acoustic information, linear discriminant classifiers and k-nearest neighbor classifiers were used for emotion classification. The combination of acoustic and linguistic information is posed as a data fusion problem to obtain the combined decision. Results using spoken dialog data obtained from a telephone-based human-machine interaction application show that combining acoustic and language information improves negative emotion classification by 45.7% over using only acoustic information (with the linear discriminant classifier) and by 32.9% over using only language information.
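
The abstract does not define 'emotional salience' or the fusion rule, so the following is only a minimal sketch under stated assumptions. It assumes that the salience of a word is the Kullback-Leibler divergence between the emotion-class posterior given that word and the class prior, and that fusion is a weighted average of two classifier scores. The names `emotional_salience` and `fuse`, and the `weight` parameter, are hypothetical and chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def emotional_salience(utterances):
    """Estimate a per-word salience score from labeled utterances.

    utterances: list of (list_of_words, emotion_label) pairs.
    Assumption: salience(w) = sum_e P(e|w) * log(P(e|w) / P(e)),
    with probabilities estimated from utterance counts.
    """
    # Class prior P(e), estimated from the label frequencies.
    class_counts = Counter(label for _, label in utterances)
    total = sum(class_counts.values())
    prior = {e: c / total for e, c in class_counts.items()}

    # For each word, count the utterances of each class that contain it.
    word_class = defaultdict(Counter)
    for words, label in utterances:
        for w in set(words):
            word_class[w][label] += 1

    salience = {}
    for w, counts in word_class.items():
        n_w = sum(counts.values())
        score = 0.0
        for e, c in counts.items():
            posterior = c / n_w  # P(e|w); classes with zero count add nothing
            score += posterior * math.log(posterior / prior[e])
        salience[w] = score
    return salience


def fuse(acoustic_score, language_score, weight=0.5):
    """Assumed decision-level fusion: weighted average of two scores."""
    return weight * acoustic_score + (1.0 - weight) * language_score
```

Under this assumed definition, a word that occurs equally often in every emotion class gets a salience of zero, while a word that occurs in only one class gets the largest score available for that class prior.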