ISCA Archive SSW 2023

SarcasticSpeech: Speech Synthesis for Sarcasm in Low-Resource Scenarios

Zhu Li, Xiyuan Gao, Shekhar Nayak, Matt Coler

Sarcastic speech synthesis, the ability to generate speech that conveys sarcasm, has applications in areas such as entertainment and more natural human-computer interaction. This study presents a first attempt to apply transfer learning from a diverse speech style dataset to the challenging domain of sarcastic speech synthesis. The limited availability of sarcastic speech data makes it difficult to capture the expressive nature of sarcasm. To address this, a pre-trained model is fine-tuned on a dataset encompassing various speech styles, including sarcastic speech. Although the synthesized speech still contains some robotic elements, the results indicate moderate performance improvements in sarcastic speech synthesis through transfer learning. Future work will explore multi-modal approaches to further enhance the expressiveness and naturalness of generated sarcastic speech.