ISCA Archive Interspeech 2023

Self-Paced Pattern Augmentation for Spoken Term Detection in Zero-Resource

Sudhakar P, Sreenivasa K. Rao, Pabitra Mitra

Spoken term detection is challenging when a large volume of spoken content is generated without annotation. The pattern discovery approach aims to overcome this challenge by capturing pattern similarities directly from the representation of the speech signal. A key difficulty for pattern discovery is handling the variability of natural speech. The proposed approach addresses pattern variability in the similarity region between spoken terms in three stages. First, the pattern similarities between two spoken terms are captured using our heuristic search, and the pattern variabilities in the similarity region are observed. Second, the observed pattern variabilities are supplied as augmented input to a Siamese network, which learns the relationship. Finally, the trained network is used to identify matches between a spoken query and a document. In experiments on the Microsoft Low-Resource Language corpus, the proposed approach reduces false alarms by 17.7% and improves spoken term detection accuracy by 7.1%.
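The final matching stage can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network, the features, or the search, so every name here (`embed`, `cosine`, `detect`, the `threshold` value) is hypothetical. The sketch assumes the query and document are sequences of feature frames, and it substitutes simple mean pooling for the trained Siamese encoder. The only Siamese property it preserves is that the same embedding function is applied to both inputs before they are compared.

```python
import math
from typing import List, Sequence, Tuple

Frame = Sequence[float]


def embed(segment: Sequence[Frame]) -> List[float]:
    """Hypothetical stand-in for the shared (Siamese) encoder.

    Mean-pools the frames of a segment into one vector. A trained
    network would replace this; mean pooling ignores frame order.
    """
    dim = len(segment[0])
    return [sum(frame[d] for frame in segment) / len(segment) for d in range(dim)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect(
    query: Sequence[Frame],
    document: Sequence[Frame],
    threshold: float = 0.95,  # assumed value, chosen only for illustration
) -> List[Tuple[int, float]]:
    """Slide a query-length window over the document.

    Both the query and each window pass through the same embed()
    function; windows whose similarity reaches the threshold are
    returned as (start_frame, score) pairs.
    """
    query_vec = embed(query)
    width = len(query)
    hits = []
    for start in range(len(document) - width + 1):
        score = cosine(query_vec, embed(document[start:start + width]))
        if score >= threshold:
            hits.append((start, score))
    return hits
```

Raising `threshold` in this sketch trades missed detections for fewer false alarms, which is the same trade-off the reported false-alarm and accuracy figures measure for the actual system.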