ISCA Archive Interspeech 2025

Towards Few-Shot Training-Free Anomaly Sound Detection

Ho-Hsiang Wu, Wei-Cheng Lin, Abinaya Kumar, Luca Bondi, Shabnam Ghaffarzadegan, Juan Pablo Bello

Anomaly sound detection is an important audio task with applications in industrial monitoring, healthcare, security and surveillance, and other domains. Existing methods are often designed around assumptions about specific types of anomalies and are poorly suited to challenging real-world scenarios involving domain shift and data scarcity. In this work, we propose a few-shot training-free method that leverages pre-trained audio models to extract patch-based spatial-temporal representations for few-shot anomaly detection and segmentation. We show that the proposed approach is flexible and applicable even in extreme low-shot regimes, and can at times outperform models trained on the full datasets. Furthermore, our method is more robust to domain shifts, requiring only a few data points to adapt quickly to various conditions, which makes it better suited for deployment in real-world applications.