ISCA Archive Interspeech 2023

Utility-Preserving Privacy-Enabled Speech Embeddings for Emotion Detection

Chandrashekhar Lavania, Sanjiv Das, Xin Huang, Kyu J. Han

Audio privacy has been approached using adversarial task training or adversarial models based on GANs, where the models also suppress scoring of other attributes (e.g., emotion), but the resulting embeddings still retain enough information to bypass speaker privacy. We use feature-importance methods from the explainability literature to modify embeddings from adversarial task training, providing a simple and accurate approach to generating embeddings that preserve speaker privacy without attenuating utility for related tasks (e.g., emotion recognition). This enables better adherence to privacy regulations around biometrics and voiceprints, while retaining the usefulness of audio representation learning.
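
The general idea of importance-guided embedding modification can be illustrated with a minimal sketch. The abstract does not specify which feature-importance method is used or how embeddings are modified, so everything below is an assumption made for illustration: a per-dimension Fisher score stands in for the importance measure, the modification is a simple zeroing of the dimensions that matter most for speaker identity and least for emotion, and the embeddings are synthetic. The names fisher_score, mask_speaker_dims, and nearest_centroid_accuracy are hypothetical helpers, not part of the described system.

```python
import numpy as np

def fisher_score(X, y):
    """Per-dimension ratio of between-class to within-class scatter.
    Used here as a stand-in importance measure (an assumption, not
    necessarily the measure used in the paper)."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * (class_mean - overall_mean) ** 2
        within += ((Xc - class_mean) ** 2).sum(axis=0)
    return between / (within + 1e-12)

def mask_speaker_dims(X, speaker, emotion, k):
    """Zero the k dimensions most important for speaker identity
    relative to their importance for emotion (hypothetical rule)."""
    gap = fisher_score(X, speaker) - fisher_score(X, emotion)
    drop = np.argsort(gap)[::-1][:k]
    X_masked = X.copy()
    X_masked[:, drop] = 0.0
    return X_masked, drop

def nearest_centroid_accuracy(X, y):
    """Accuracy of a nearest-class-centroid probe fit on the same data."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y).mean())

# Synthetic embeddings (illustrative only): dimensions 0-3 carry
# speaker identity, dimensions 4-5 carry emotion, the rest are noise.
rng = np.random.default_rng(0)
n, d = 400, 16
rows = np.arange(n)
speaker = rows % 4           # 4 balanced speakers
emotion = (rows // 4) % 2    # 2 balanced emotions, independent of speaker
X = rng.normal(size=(n, d))
X[rows, speaker] += 3.0
X[rows, 4 + emotion] += 3.0

X_private, drop = mask_speaker_dims(X, speaker, emotion, k=4)

print("dropped dimensions:", sorted(drop.tolist()))
print("speaker accuracy before/after:",
      nearest_centroid_accuracy(X, speaker),
      nearest_centroid_accuracy(X_private, speaker))
print("emotion accuracy before/after:",
      nearest_centroid_accuracy(X, emotion),
      nearest_centroid_accuracy(X_private, emotion))
```

On this toy data the speaker probe should fall toward chance after masking while the emotion probe stays largely unaffected, which is the privacy-utility trade-off the abstract targets. Real embeddings rarely separate this cleanly, so the choice of importance measure and of k would need to be validated against actual speaker-verification and emotion-recognition scores.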