This paper describes a speaker-independent continuous speech recognizer based on hidden Markov models (HMMs) and its evaluation on three large continuous speech databases recorded at several German universities and industrial sites. These databases differ in recording conditions, speaking rate, microphone location, amount of training and test data, and application dependency of the training material. The recognition system is based on mel-cepstral coefficients, a linear discriminant transform of the cepstral data, a soft-decision vector quantizer with Gaussian distributions, and semi-continuous hidden Markov models (SCHMMs) of context-dependent subword units.
Keywords: Continuous speech recognition, linear discriminant analysis, SCHMM, speech databases, word graphs.