ISCA Archive Eurospeech 1993

Senones, multi-pass search, and unified stochastic modeling in SPHINX-II

M. Y. Hwang, F. Alleva, X. Huang

SPHINX-II is designed for large-vocabulary, speaker-independent continuous speech recognition and is based on semi-continuous hidden Markov models. In the November 1992 ARPA speech evaluation, SPHINX-II achieved the lowest error rate (5%). This paper concentrates on the special techniques that made SPHINX-II successful and that distinguish it from other systems. Specifically, these include senonic decision trees for acoustic modeling, a multi-pass decoder to meet the challenge of very-large-vocabulary recognition, and a unified stochastic engine for jointly optimizing the acoustic and language models.

Keywords: Shared-distribution models, senones, decision trees, multi-pass decoder, unified stochastic engine