ISCA Archive SAPA 2008
ISCA Archive SAPA 2008

Discriminative word-spotting using ordered spectro-temporal patch features

Tony Ezzat, Tomaso Poggio

We present a novel architecture for word-spotting which is trained from a small number of examples to classify an utterance as containing a target keyword or not. The word-spotting architecture relies on a novel feature set consisting of a set of ordered spectro-temporal patches which are extracted from the exemplar mel-spectra of target keywords. A local pooling operation across frequency and time is introduced which endows the extracted patch features with the flexibility to match novel unseen keywords. Finally, we describe how to train a support vector machine classifier to separate between keyword and nonkeyword patch feature responses. We present preliminary results indicating that our word-spotting architecture achieves a detection rate of 70-95% with false positive rates of about 0.25-2 false positives per minute.