Several approaches to Automatic Speech Understanding (ASU) have recently been proposed by Prieto et al., Segarra et al., and Pieraccini et al., based on Regular Grammars or N-grams. Neural Networks have also been used by Stolcke in experiments with synthetic data. However, applying ASU to real-world problems generally yields (acoustic-syntactic-semantic) models of considerable size. Moreover, the acoustic and syntactic-semantic components of these models are often learned separately and integrated afterwards. In this paper, we propose an approach that directly maps the acoustic domain into a (large) semantic space through a small Simple Recurrent Network, which also learns all the features automatically and simultaneously. As an application example, we consider a Continuous Speech Understanding (CSU) task recently addressed by Prieto et al. using Grammatical Inference (GI) techniques.
Keywords: Automatic Speech Understanding, Language Modeling, Neural Networks, Speech Recognition.