ISCA Archive Interspeech 2023
ISCA Archive Interspeech 2023

Joint Instance Reconstruction and Feature Subspace Alignment for Cross-Domain Speech Emotion Recognition

Keke Zhao, Peng Song, Shaokai Li, Wenming Zheng

Speech emotion recognition is a popular research branch of speech signal processing. Many previous studies have proven that the generalization ability of the emotion recognition model across domains can be improved by using transfer learning methods. To solve the cross-domain speech emotion recognition problem, this paper proposes a novel transfer learning method, which simultaneously performs the instance reconstruction and subspace alignment. Firstly, we conduct the instance transferring based on coupled projection, which utilizes a weighting reconstruction strategy to exploit the intrinsic information of cross-domain samples and improve the contribution of essential features through an adaptive weighting matrix. Then, we conduct the feature transferring through a novel co-regularized term, which can make the source and target subspace be well aligned. Finally, extensive experiments indicate that our method is superior to several state-of-the-art methods.