ISCA Archive Eurospeech 1991

Speaker independent word recognition using HMMs with an orthogonalized phonetic segment codebook

Tsuneo Nitta, Jun'ichi Iwasaki, Hiroshi Matsu'ura

Large matrix quantization (MQ) distortion becomes a problem because the spectrum-time patterns in MQ have many dimensions and wide variation. In this paper, we introduce a multiple phonological unit, called the phonetic segment, as the unit of MQ and apply statistical matrix quantization (SMQ). SMQ effectively incorporates the pattern variations of each phonetic segment into an orthogonalized phonetic segment codebook. We also propose a simple SMQ-HMM training algorithm, called Equally Counted K-best Learning, in which each phonetic event observed within the best K is counted equally in a model and output probabilities are smoothed without a fuzzy rule. The proposed method has been tested on a 100-word vocabulary data set uttered by 10 unknown speakers, using a real-time recognition system, and has achieved a high performance of 96.0%.
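The counting idea behind Equally Counted K-best Learning can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the training procedure in detail, so the function names, the fixed frame-to-state alignment, the use of a plain distortion matrix, and the uniform fallback for unvisited states are all assumptions made for illustration. The only element taken from the abstract is that every codeword within the best K for a frame receives the same count, regardless of its rank.

```python
import numpy as np

def k_best_codewords(distances, k):
    """Return, per frame, the indices of the k lowest-distortion codewords.

    distances: array of shape (num_frames, codebook_size); hypothetical input.
    """
    return np.argsort(distances, axis=1)[:, :k]

def equally_counted_output_probs(k_best, state_of_frame, num_states, codebook_size):
    """Estimate discrete HMM output probabilities from K-best observations.

    Every codeword in a frame's K-best list adds the same count to the state
    that frame is aligned to (assumed alignment), independent of its rank.
    """
    counts = np.zeros((num_states, codebook_size))
    for frame, state in enumerate(state_of_frame):
        for code in k_best[frame]:
            counts[state, code] += 1.0  # equal weight for each of the K best
    totals = counts.sum(axis=1, keepdims=True)
    # Assumption: a state with no aligned frames falls back to a uniform row.
    return np.where(totals > 0, counts / np.maximum(totals, 1.0), 1.0 / codebook_size)
```

Because several codewords are counted per frame, the resulting distributions are spread over neighbouring codewords, which gives a smoothing effect without weighting the candidates by rank or distance.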