In this paper we present an experimental system for speaker-independent isolated word recognition based on vector quantization and multilayer perceptron networks. Taking advantage of both supervised and unsupervised learning, we explore how well the system generalizes from limited training data. Two groups of networks, trained with code words, classify static and dynamic feature vectors respectively. Within each group, several networks with the same structure classify vector-quantized sequences ordered by minimal distortion. Each network makes its own weighted contribution to the final decision. Because the networks can be trained separately and each one is relatively small, the capacity of the whole system can be extended effectively and its performance improved without increasing the computational effort.
Keywords: speaker-independent speech recognition, isolated word recognition, neural networks, speech recognition.
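
The two mechanisms named in the abstract, code-word sequences ordered by distortion and a weighted combination of network outputs, can be illustrated with a minimal sketch. It is not the system's actual implementation: the squared Euclidean distortion measure, the weighted-sum combination rule, and the names `quantize_ranked` and `combine_scores` are all assumptions made for illustration, and the per-network scores are taken as given rather than produced by trained perceptrons.

```python
import numpy as np


def quantize_ranked(frames, codebook, n_ranks):
    """Return code-word indices for each frame, ordered by distortion.

    frames:   (T, D) array of feature vectors
    codebook: (K, D) array of code words
    Result:   (n_ranks, T) array; row r holds, for every frame, the index
              of its r-th closest code word (row 0 = minimal distortion).
    Assumption: distortion is squared Euclidean distance.
    """
    frames = np.asarray(frames, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    diff = frames[:, None, :] - codebook[None, :, :]
    distortion = np.sum(diff ** 2, axis=2)   # shape (T, K)
    order = np.argsort(distortion, axis=1)   # ascending distortion
    return order[:, :n_ranks].T


def combine_scores(scores, weights):
    """Combine per-network word scores into one decision.

    scores:  (n_networks, n_words) array, one row of scores per network
    weights: (n_networks,) array, one weight per network
    Assumption: the final score is a plain weighted sum of network outputs.
    Returns the winning word index and the combined score vector.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = weights @ scores
    return int(np.argmax(total)), total


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
    frames = [[0.1, 0.1], [3.0, 3.0]]
    # Each row would feed one network of a group in this sketch.
    print(quantize_ranked(frames, codebook, n_ranks=2))
    # Hypothetical outputs of two networks over a two-word vocabulary.
    print(combine_scores([[0.6, 0.4], [0.2, 0.8]], [0.5, 0.5]))
```

In this sketch each row of the ranked index array would be the input sequence for one network in a group, so adding another rank adds another small, separately trainable network rather than enlarging an existing one.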