ISCA Archive Eurospeech 1993

Segmental hidden Markov models

M. J. F. Gales, Steve J. Young

The most popular and successful acoustic model for speech recognition is the Hidden Markov Model (HMM). To use HMMs for speech recognition, a series of assumptions is made about the waveform, some of which are known to be poor. In particular, the 'Independence Assumption' implies that each observation depends only on the state that generated it, not on neighbouring observations. In this paper, a new form of acoustic model, the Segmental Hidden Markov Model (SHMM), is described, in which the effect of the 'Independence Assumption' on the observation likelihood is greatly reduced. In the SHMM, all observations are assumed to be independent given the state that generated them, but they are additionally conditioned on the mean of the segment of speech to which they belong. Re-estimation formulae are presented for training both single and multiple Gaussian mixture models, and a recognition algorithm is described. Additionally, it is shown that the standard HMM, in both the single and multiple Gaussian mixture cases, is a special case of the SHMM. The new model is shown to provide better recognition performance than the standard HMM on a wider set of synthetic data.

Keywords: speech recognition, HMM, segment models.
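
The abstract does not give the model's equations, so the following is only an illustrative sketch of the idea of conditioning observations on a segment mean, not the authors' formulation. It assumes scalar observations, a Gaussian distribution over the segment mean (centre `mu`, variance `tau2`), and Gaussian observations around that mean (variance `sigma2`); all function and parameter names are hypothetical. Under these assumptions the segment mean can be integrated out in closed form, and setting `tau2 = 0` recovers the standard single-Gaussian HMM state likelihood.

```python
import math


def hmm_segment_loglik(obs, mu, sigma2):
    """Log-likelihood of a segment under the usual independence assumption:
    each observation is drawn independently from N(mu, sigma2)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * sigma2) + (y - mu) ** 2 / sigma2)
        for y in obs
    )


def shmm_segment_loglik(obs, mu, tau2, sigma2):
    """Log-likelihood of a segment under the assumed segmental model:
    a segment mean m ~ N(mu, tau2) is shared by the whole segment and
    each observation is drawn from N(m, sigma2).

    Integrating out m gives a joint Gaussian with covariance
    sigma2 * I + tau2 * ones * ones^T, evaluated here in closed form.
    """
    T = len(obs)
    d = [y - mu for y in obs]
    sum_d = sum(d)
    sum_d2 = sum(x * x for x in d)
    # log-determinant of the covariance matrix
    log_det = (T - 1) * math.log(sigma2) + math.log(sigma2 + T * tau2)
    # quadratic form d^T Sigma^{-1} d (Sherman-Morrison identity)
    quad = (sum_d2 - tau2 / (sigma2 + T * tau2) * sum_d ** 2) / sigma2
    return -0.5 * (T * math.log(2 * math.pi) + log_det + quad)


# A segment whose observations are all offset from the state centre 0.0
obs = [1.0, 1.2, 0.9]
print(hmm_segment_loglik(obs, mu=0.0, sigma2=0.5))
print(shmm_segment_loglik(obs, mu=0.0, tau2=1.0, sigma2=0.5))
```

In this toy setting the segmental likelihood is higher for a segment that is consistently offset from the state centre, because the shared segment mean absorbs the offset instead of penalising every observation separately.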