ISCA Archive ICSLP 1992

Speaker-independent keyword recognition based on SMQ/HMM

Yasuyuki Masai, Shin'ichi Tanaka, Tsuneo Nitta

This paper describes a speaker-independent keyword recognition system based on hidden Markov models (HMMs). To achieve accurate and efficient word spotting, we propose two techniques: a new matrix quantization (MQ) algorithm called Statistical MQ (SMQ), which uses an orthogonalized phonetic segment codebook, and a word beginning frame prediction (BFP) algorithm. SMQ incorporates the pattern variations of each phonetic segment into the orthogonalized codebook, which contains about 700 phonetic segments, and transforms the input speech into a sequence of phonetic symbols. The BFP algorithm predicts the word beginning frame at which the next Viterbi alignment should be generated, using the frame at which the initial transition to the second state occurred in the most recent Viterbi alignment. The proposed system was tested on a data set of 32 keywords, 5 auxiliary words, and 17 unknown words. The test data consisted of compound words uttered by 6 unknown speakers, and the keyword recognition accuracy was 91.1%.
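
The BFP idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the HMM topology nor the exact prediction rule, so the sketch assumes a discrete left-to-right HMM over phonetic symbols and assumes that the next alignment simply begins at the frame where the previous alignment first entered the second state. The names `viterbi_path`, `first_transition_frame`, `predict_next_begin`, and `toy_model` are hypothetical.

```python
import math

NEG_INF = float("-inf")


def viterbi_path(symbols, log_trans, log_emit):
    """Best state path for a left-to-right HMM.

    Assumes the path starts in state 0, ends in the last state, and that
    there are at least as many frames as states.
    """
    n_states = len(log_trans)
    n_frames = len(symbols)
    score = [[NEG_INF] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    score[0][0] = log_emit[0][symbols[0]]
    for t in range(1, n_frames):
        for j in range(n_states):
            # Left-to-right topology: stay in state j or arrive from j - 1.
            for i in (j, j - 1):
                if i < 0:
                    continue
                cand = score[t - 1][i] + log_trans[i][j]
                if cand > score[t][j]:
                    score[t][j] = cand
                    back[t][j] = i
            score[t][j] += log_emit[j][symbols[t]]
    # Trace back from the final state at the last frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def first_transition_frame(path):
    """Index of the first frame aligned to the second state (state 1)."""
    for t, state in enumerate(path):
        if state >= 1:
            return t
    return None


def predict_next_begin(window_start, path):
    """Assumed rule: the next alignment begins at the absolute frame where
    the most recent alignment first entered the second state."""
    offset = first_transition_frame(path)
    return None if offset is None else window_start + offset


def toy_model(n_states=3, n_symbols=3):
    """Toy model: state j emits symbol j with probability 0.8, others 0.1."""
    log_trans = [[NEG_INF] * n_states for _ in range(n_states)]
    for i in range(n_states):
        log_trans[i][i] = math.log(0.5)          # self-loop
        if i + 1 < n_states:
            log_trans[i][i + 1] = math.log(0.5)  # advance to next state
    log_emit = [
        [math.log(0.8 if k == j else 0.1) for k in range(n_symbols)]
        for j in range(n_states)
    ]
    return log_trans, log_emit


if __name__ == "__main__":
    log_trans, log_emit = toy_model()
    symbols = [0, 0, 0, 1, 1, 2, 2]  # stand-in for an SMQ symbol sequence
    path = viterbi_path(symbols, log_trans, log_emit)
    print(path, predict_next_begin(10, path))
```

In this sketch the frames spent in the first state are skipped when choosing the next starting point, which is one plausible way the prediction could reduce the number of Viterbi alignments needed during word spotting.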