In [1], we described a word-spotting technique consisting of a continuous-speech decoder and a post-decoder processor. The continuous-speech decoder performs a modified Viterbi search through a network of connected hidden Markov models (HMMs) to produce hypothesized keyword segments. The post-processor then performs a discrete-word decoding to label each hypothesized keyword segment, generates the corresponding a-posteriori measures, and decides the recognition outcome. In this paper, we present an approach for estimating post-processor parameters that produce recognition errors near their empirical lower bounds. To develop and evaluate the proposed scheme, we used the Road Rally speech corpus. The results show that, at similar rejection levels, the proposed approach reduces the substitution and false-alarm rates reported in [1] by up to 35%.
Keywords: Word Spotting, Post-Processor, A-posteriori Measure, Decision Region, Operation Point