ISCA Archive Interspeech 2023
ISCA Archive Interspeech 2023

Episodic Memory For Domain-Adaptable, Robust Speech Emotion Recognition

James Tavernor, Matthew Perez, Emily Mower Provost

Emotion conveys abundant information that can improve the user experience of various automated systems, in addition to communicating information important for managing well-being. Human speech conveys emotion, but speech emotion recognition models do not perform well in unseen environments. This limits the ubiquitous use of speech emotion recognition models. In this paper, we investigate how a model can be adapted to unseen environments without forgetting previously learned environments. We show that memory-based methods maintain performance on previously seen environments while still being able to adapt to new environments. These methods enable continual training of speech emotion recognition models following deployment while retaining previous knowledge, working towards a more general, adaptable, acoustic model.