ISCA Archive Interspeech 2022

Coupled Discriminant Subspace Alignment for Cross-database Speech Emotion Recognition

Shaokai Li, Peng Song, Keke Zhao, Wenjing Zhang, Wenming Zheng

Speech emotion recognition (SER) is a long-standing and important research problem in speech signal processing. In practice, the training and test data are often collected under different conditions, e.g., in different languages or with different recording devices, which can severely degrade recognition performance. To tackle this problem, in this work we propose a novel transfer learning algorithm, named coupled discriminant subspace alignment (CDSA), for cross-database SER. In CDSA, we first conduct linear discriminant analysis (LDA) on the source and target databases separately. Meanwhile, we learn a latent common subspace in which each target sample is represented as a combination of source samples. Furthermore, we align the projection subspaces of the source and target databases to make the model more robust. Extensive experiments are carried out on four benchmark databases, and the results demonstrate the effectiveness of the proposed method.
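
The three ingredients named above (per-database LDA, representing target samples by source samples, and comparing the two projections) can be illustrated with a minimal NumPy sketch. This is not the CDSA algorithm itself: the abstract does not give the objective function or its optimization, so here each term is computed separately rather than learned jointly. The function names, the ridge-regularized reconstruction, the regularization values, the synthetic data, and the use of given target labels as a stand-in for pseudo-labels are all assumptions made for illustration.

```python
import numpy as np

def lda_projection(X, y, dim, reg=1e-3):
    """Fisher LDA. X: (n_samples, n_features), y: integer labels.
    Returns a projection matrix of shape (n_features, dim)."""
    n_feat = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((n_feat, n_feat))  # within-class scatter
    Sb = np.zeros((n_feat, n_feat))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - mean_all)[:, None]
        Sb += len(Xc) * (d @ d.T)
    # Leading eigenvectors of (Sw + reg*I)^{-1} Sb; reg keeps the solve stable.
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw + reg * np.eye(n_feat), Sb))
    order = np.argsort(-vals.real)[:dim]
    return vecs[:, order].real

def reconstruct_target(A, B, lam=1e-2):
    """Ridge coefficients Z of shape (n_s, n_t) minimizing
    ||B.T - A.T @ Z||_F^2 + lam * ||Z||_F^2, i.e. each projected target
    sample is approximated by a linear combination of projected source samples."""
    n_s = A.shape[0]
    return np.linalg.solve(A @ A.T + lam * np.eye(n_s), A @ B.T)

# Synthetic stand-in for two databases (hypothetical data, not real SER features).
rng = np.random.default_rng(0)
n_feat, n_cls, dim = 5, 3, 2
centers = rng.normal(scale=3.0, size=(n_cls, n_feat))
y_s = np.repeat(np.arange(n_cls), 20)
y_t = np.repeat(np.arange(n_cls), 10)  # assumed pseudo-labels for the target
X_s = centers[y_s] + rng.normal(size=(len(y_s), n_feat))
X_t = centers[y_t] + 0.5 + rng.normal(size=(len(y_t), n_feat))  # shifted domain

P_s = lda_projection(X_s, y_s, dim)  # source projection
P_t = lda_projection(X_t, y_t, dim)  # target projection
A, B = X_s @ P_s, X_t @ P_t          # projected source / target samples
Z = reconstruct_target(A, B)         # source-to-target reconstruction weights

recon_err = np.linalg.norm(B.T - A.T @ Z)  # reconstruction residual
align_gap = np.linalg.norm(P_s - P_t)      # Frobenius gap between projections
```

Because the two projections are computed independently here, `align_gap` is only a diagnostic (eigenvectors are defined up to sign and scale); in a joint formulation such a term would act as a penalty that pulls the two projections toward each other during optimization.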