ISCA Archive ECST 1987

A proposal for a speaker independent isolated word (SIIW) recogniser of a limited vocabulary

M. H. Savoji

A spoken word is treated as a single input pattern in its entirety, and no time warping of frame-based features is envisaged. The Hadamard-Walsh transform (HWT) is used to create the feature space, which is mapped to binary values by thresholding. The threshold levels are determined during the training session. Feature selection is carried out in two steps. First, a minimum number of features is chosen to represent each class centre with a unique binary code. This minimum subset is then expanded by chain coding the original binary codes, which introduces redundant bits drawn from additional transform coefficients. This expansion increases the Hamming distance between classes. Classification is based on the shortest Hamming distance from the input pattern to equally distant centroids. Classification errors are detected and corrected in a manner similar to the error detection/correction used in chain coding.
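
The pipeline described above (transform, threshold to bits, nearest centre by Hamming distance) can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the toy four-sample patterns, and the single fixed threshold of 2.0 are assumptions made for illustration, whereas the abstract states that threshold levels are learned during training. The feature selection, the chain-coding expansion, and the error detection/correction steps are not modelled here.

```python
from typing import Dict, List, Sequence, Tuple


def fwht(x: Sequence[float]) -> List[float]:
    """Unnormalised fast Walsh-Hadamard transform (natural Hadamard order).

    The input length must be a power of two.
    """
    a = list(x)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine elements that are h apart.
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a


def binarise(coeffs: Sequence[float], thresholds: Sequence[float]) -> Tuple[int, ...]:
    """Map each transform coefficient to 1 if it exceeds its threshold, else 0."""
    return tuple(1 if c > t else 0 for c, t in zip(coeffs, thresholds))


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(code: Sequence[int], centres: Dict[str, Tuple[int, ...]]) -> str:
    """Return the label of the class centre closest in Hamming distance."""
    return min(centres, key=lambda label: hamming(code, centres[label]))


# Hypothetical toy data: two "words", each a four-sample pattern.
thresholds = [2.0] * 4  # assumed fixed; the abstract trains these levels
centres = {
    "a": binarise(fwht([1, 1, 1, 1]), thresholds),    # code (1, 0, 0, 0)
    "b": binarise(fwht([1, -1, 1, -1]), thresholds),  # code (0, 1, 0, 0)
}

# A slightly perturbed version of pattern "a" still maps to the same code.
noisy = [0.9, 1.2, 1.1, 0.8]
print(classify(binarise(fwht(noisy), thresholds), centres))  # prints: a
```

In this toy case the two class codes differ in two bits. Adding redundant bits from further coefficients, as the abstract proposes, would widen that separation and leave room for detecting or correcting single-bit errors in the input code.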