In this paper we report on a sequence of experiments designed to explore the use of analysis-by-synthesis methods for speech recognition and speech analysis in general. An intermediate representation of the speech signal is formulated in terms of speech-synthesis-like parameters.
Using a multi-layer perceptron as a common classifier, we have performed several vowel classification experiments based on these parameters. The results indicate that we obtain the same classification performance as with a more traditional spectral representation while using nearly an order of magnitude fewer dimensions.
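To make the comparison concrete, the following is a minimal sketch, not the original experimental setup, of how a single MLP classifier can be applied to both a low-dimensional synthesis-parameter representation and a higher-dimensional spectral representation of the same vowel frames. The data here are random placeholders, and the dimensions (5 versus 40) and network size are illustrative assumptions only.

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data standing in for measured features: in a real experiment these
# would be per-frame feature vectors with a vowel label for each frame.
n_frames = 1000
vowel_labels = rng.integers(0, 10, size=n_frames)      # e.g. 10 vowel classes
synth_params = rng.normal(size=(n_frames, 5))          # low-dimensional synthesis-like parameters
spectra = rng.normal(size=(n_frames, 40))              # higher-dimensional spectral representation

def classification_rate(features, labels):
    """Train a small MLP and return its accuracy on held-out frames."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, labels, test_size=0.25, random_state=0)
    mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
    mlp.fit(X_tr, y_tr)
    return mlp.score(X_te, y_te)

print("synthesis parameters   :", classification_rate(synth_params, vowel_labels))
print("spectral representation:", classification_rate(spectra, vowel_labels))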
We have also developed a speaker normalization procedure that improves the classification rate compared to that obtained with a simple male/female normalization.
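The details of the procedure are given later in the paper; as a rough illustration of what a per-speaker normalization of the measured parameters can look like (not necessarily the procedure used here), the sketch below applies a speaker-specific z-score, whereas a simple male/female normalization would use only two sets of statistics.

import numpy as np

def speaker_normalize(features, speaker_ids):
    """Per-speaker z-score normalization of measured parameter vectors.

    features:    (n_frames, n_params) automatically measured synthesis parameters
    speaker_ids: (n_frames,) speaker identifier for each frame
    """
    normalized = np.empty_like(features, dtype=float)
    for spk in np.unique(speaker_ids):
        idx = speaker_ids == spk
        mean = features[idx].mean(axis=0)
        std = features[idx].std(axis=0) + 1e-8   # avoid division by zero
        normalized[idx] = (features[idx] - mean) / std
    return normalized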
In our last set of experiments we studied the influence of context on the classification results. The best classification results in our experiments were achieved by combining default formants and labels specifying the context with speaker normalization of the automatically measured synthesis parameters.
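The exact composition of the combined input vector is described in the experiments section; purely as an assumed illustration, one way to assemble such an input is to concatenate the speaker-normalized parameters with default formant values for the neighbouring phonemes and one-hot context labels, as sketched below.

import numpy as np

def one_hot(index, size):
    v = np.zeros(size)
    v[index] = 1.0
    return v

def combined_feature(norm_params, default_formants, left_ctx, right_ctx, n_labels):
    """Concatenate speaker-normalized parameters, default formants for the
    context phonemes, and one-hot context labels into one classifier input.

    norm_params:      speaker-normalized synthesis parameters for the vowel frame
    default_formants: mapping from context label to its default formant values
    left_ctx, right_ctx: integer labels of the preceding and following phoneme
    """
    return np.concatenate([
        norm_params,
        default_formants[left_ctx],
        default_formants[right_ctx],
        one_hot(left_ctx, n_labels),
        one_hot(right_ctx, n_labels),
    ])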