ISCA Archive Eurospeech 1997

Organizing phone models based on piecewise linear segment lattices of speech samples

Hiroaki Kojima, Kazuyo Tanaka

Aiming at robust speech recognition, we have proposed a framework for "phonological concept formation," the task of acquiring an efficient representation of phonemes from spoken word samples without using any transcriptions other than the lexical classification of the words. To implement this task, we propose the "piecewise linear segment lattice (PLSL)" model for phoneme representation. The model is structured as a lattice of segments, each represented by the regression coefficients of the feature vectors within the segment. To organize phone models, operations including division, concatenation, blocking and clustering are applied to the models. The feasibility of the method is discussed with experimental results for isolated word recognition, in which applying these operations improves the recognition rate.
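
As a rough illustration of the segment representation only, the sketch below fits a straight line to each feature dimension within a segment and keeps the resulting regression coefficients. The abstract does not state the regression order, the time normalization, or how segment boundaries are chosen, so a first-order fit over time rescaled to [0, 1] and caller-supplied boundaries are assumptions here. The names `fit_segment` and `fit_segments` are hypothetical, and neither the lattice structure nor the division, concatenation, blocking and clustering operations are modeled.

```python
def fit_segment(frames):
    """Least-squares line fit per feature dimension over one segment.

    frames: list of feature vectors (lists of floats), one per time frame.
    Returns (intercepts, slopes), one value per dimension, with time
    rescaled to [0, 1] across the segment (an assumption, not from the paper).
    """
    n = len(frames)
    dim = len(frames[0])
    if n == 1:
        # A single frame has no trend: constant value, zero slope.
        return list(frames[0]), [0.0] * dim
    t = [i / (n - 1) for i in range(n)]
    t_mean = sum(t) / n
    t_var = sum((ti - t_mean) ** 2 for ti in t)
    intercepts, slopes = [], []
    for d in range(dim):
        y = [frame[d] for frame in frames]
        y_mean = sum(y) / n
        cov = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
        slope = cov / t_var
        slopes.append(slope)
        intercepts.append(y_mean - slope * t_mean)
    return intercepts, slopes


def fit_segments(frames, boundaries):
    """Fit every segment delimited by boundaries, e.g. [0, 3, 5].

    Segment k covers frames[boundaries[k]:boundaries[k + 1]].
    """
    return [
        fit_segment(frames[start:end])
        for start, end in zip(boundaries[:-1], boundaries[1:])
    ]


# Usage: five 2-dimensional frames split into two hypothetical segments.
frames = [[1.0, 3.0], [1.5, 2.75], [2.0, 2.5], [2.5, 2.25], [3.0, 2.0]]
segments = fit_segments(frames, [0, 3, 5])
```

Each segment is thereby reduced to a small, fixed number of coefficients regardless of its length, which is what would make segments of different durations comparable under this assumed parameterization.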