Although speech emotion recognition (SER) systems are becoming increasingly accurate, little is known about the confidence of their predictions. To shed some light on this, we propose a confidence measure for SER systems based on semi-supervised learning. During the semi-supervised learning procedure, five frequently used databases with manually created confidence labels are used to train classifiers. When the SER system predicts the label of an unseen test utterance, these classifiers serve as reliability estimators for the utterance and output a series of confidence ratios, which are combined into a single confidence measure. Our experimental results show that the proposed confidence measure is effective in indicating how much the predicted emotion can be trusted.
Index Terms: speech emotion recognition, confidence measure, semi-supervised learning, cross-corpus