ISCA Archive ICSLP 1992

LVQ-based speech recognition with high-dimensional context vectors

Jyri Mantysalo, Kari Torkkola, Teuvo Kohonen

In this paper we apply Learning Vector Quantization (LVQ) methods, including the latest developments [1, 3, 2, 4], to the task of Finnish speaker-dependent speech recognition. The main objective is to study the effect of radically increasing the dimensionality of the context vectors. The high-dimensional feature vectors in our work represent the whole phoneme and are formed by both averaging and concatenating short-time feature vectors within a time-domain window. Excellent results are achieved in the classification of separate phonemes in Finnish speech. Moreover, we show how this method can be applied to combined labeling and segmentation of continuous speech. In this task we use an additional segmentation LVQ codebook and combine the information using HMMs.
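As a rough illustration of the two ideas named in the abstract, the sketch below builds a context vector by averaging short-time feature vectors within contiguous parts of a window and concatenating the averages, and then applies one update of basic LVQ1. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the window length, the number of averaged segments, or which LVQ variants were used, so `n_segments`, the function names `context_vector` and `lvq1_step`, and the choice of LVQ1 (the simplest variant) are all illustrative.

```python
from typing import List, Sequence

Vector = List[float]


def context_vector(frames: Sequence[Sequence[float]], n_segments: int) -> Vector:
    """Average the frames inside each of n_segments contiguous groups,
    then concatenate the group averages into one long vector.

    The grouping scheme is an assumption made for illustration only.
    """
    n_frames = len(frames)
    if n_segments < 1 or n_frames < n_segments:
        raise ValueError("need at least one frame per segment")
    dim = len(frames[0])
    out: Vector = []
    for s in range(n_segments):
        # Integer boundaries that split the window as evenly as possible.
        start = s * n_frames // n_segments
        end = (s + 1) * n_frames // n_segments
        group = frames[start:end]
        out.extend(sum(f[d] for f in group) / len(group) for d in range(dim))
    return out


def lvq1_step(codebook: List[Vector], labels: Sequence[str],
              x: Sequence[float], label: str, alpha: float) -> int:
    """Apply one LVQ1 update in place and return the winning index.

    The nearest codebook vector (squared Euclidean distance) moves toward x
    if its class matches the sample's class, and away from x otherwise.
    """
    best = min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], x)),
    )
    sign = 1.0 if labels[best] == label else -1.0
    codebook[best] = [c + sign * alpha * (v - c) for c, v in zip(codebook[best], x)]
    return best


if __name__ == "__main__":
    # Four hypothetical 2-dimensional short-time feature vectors.
    window = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    print(context_vector(window, n_segments=2))  # [2.0, 3.0, 6.0, 7.0]
```

With `D`-dimensional short-time features and `n_segments` groups, the resulting context vector has `n_segments * D` components, which is how the dimensionality grows as more of the phoneme is covered.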