ISCA Archive Interspeech 2025

Zero-Shot Learning for Acoustic Event Classification Using an Attribute Vector and Conditional GAN

Kohei Uehara, Ryoichi Takashima, Tetsuya Takiguchi

This paper presents a zero-shot acoustic event classification (ZS-AEC) method for classifying acoustic events for which no training data is available. A previous study classified unseen events by estimating attribute information instead of acoustic event labels, where each acoustic event is associated with attributes such as the material of the sound source and the pitch (high or low) of the sound. However, that method often misclassifies unseen acoustic events as seen events. In this paper, we propose a generative ZS-AEC method that reduces this prediction bias toward seen acoustic events. The proposed method uses a conditional GAN to generate latent features of unseen acoustic events from their attribute information, and a classifier is then trained on the generated latent features. Experimental results show that the proposed method achieves higher classification accuracy than the conventional attribute-based method.
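To make the attribute-based idea concrete, the following is a minimal sketch of how a predicted attribute vector could be mapped to an event class, including a class with no training data. It is an illustration only, not the authors' implementation: the attribute set, the class names, the Euclidean nearest-match rule, and the function name `classify_by_attributes` are all assumptions introduced here.

```python
import numpy as np

# Hypothetical attribute vectors: [metal source, wooden source, high pitch].
# The classes and attribute values are illustrative assumptions.
CLASS_ATTRIBUTES = {
    "bell": np.array([1.0, 0.0, 1.0]),
    "gong": np.array([1.0, 0.0, 0.0]),
    "woodblock": np.array([0.0, 1.0, 1.0]),  # treated as an unseen class
}


def classify_by_attributes(predicted, class_attributes):
    """Return the class whose attribute vector is nearest to the prediction."""
    names = list(class_attributes)
    table = np.stack([class_attributes[name] for name in names])
    # Euclidean distance is an assumed matching rule, chosen for simplicity.
    distances = np.linalg.norm(table - predicted, axis=1)
    return names[int(np.argmin(distances))]


# Stand-in for the output of an attribute estimator applied to one audio clip.
predicted = np.array([0.2, 0.7, 0.9])
label = classify_by_attributes(predicted, CLASS_ATTRIBUTES)
print(label)
```

Because the decision depends only on the attribute table, a class can be recognized without any audio examples of it. The sketch does not model the bias toward seen classes described in the abstract, which arises in the attribute estimator itself, nor the conditional GAN that the proposed method uses to address it.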