ISCA Archive Eurospeech 1993

Learning direct acoustic-to-semantic mappings through simple recurrent networks

M. A. Castano, Enrique Vidal, Francisco Casacuberta

Several approaches to Automatic Speech Understanding (ASU) have recently been proposed by Prieto et al., Segarra et al., and Pieraccini et al.; they are based on Regular Grammars or N-grams. Neural Networks have also been used by Stolcke in experiments with synthetic data. However, applying ASU to real-world problems generally leads to (acoustic-syntactic-semantic) models of considerable size. Moreover, the acoustic and syntactic-semantic components of these models are often learned separately and integrated afterwards. In this paper, we propose an approach that directly maps the acoustic domain into a (large) semantic space through a small Simple Recurrent Network which, moreover, automatically learns all the features simultaneously. As an application example, we consider a Continuous Speech Understanding (CSU) task recently addressed by Prieto et al. using Grammatical Inference (GI) techniques.

Keywords: Automatic Speech Understanding, Language Modeling, Neural Networks, Speech Recognition.
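
As an illustration of the kind of model the abstract refers to, the following is a minimal sketch of the forward pass of a Simple Recurrent Network (Elman-style): at each time step the hidden layer receives the current acoustic frame together with a copy of its own previous activations (the context), and an output layer produces a vector of semantic unit activations. The layer sizes, the sigmoid activations, the random initial weights, and all names (`SimpleRecurrentNetwork`, `forward`, `w_in`, `w_ctx`, `w_out`) are assumptions made for this example; the abstract does not specify the authors' architecture, input coding, or training procedure, and training is omitted here.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights (illustrative init)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class SimpleRecurrentNetwork:
    """Elman-style network: hidden state is fed back as context at the next step."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        self.w_in = make_matrix(n_hidden, n_in, rng)       # input   -> hidden
        self.w_ctx = make_matrix(n_hidden, n_hidden, rng)  # context -> hidden
        self.w_out = make_matrix(n_out, n_hidden, rng)     # hidden  -> output

    def forward(self, frames):
        """Map a sequence of acoustic frames to one output vector per frame."""
        context = [0.0] * self.n_hidden  # context starts empty
        outputs = []
        for frame in frames:
            from_input = matvec(self.w_in, frame)
            from_context = matvec(self.w_ctx, context)
            hidden = [sigmoid(a + b) for a, b in zip(from_input, from_context)]
            outputs.append([sigmoid(z) for z in matvec(self.w_out, hidden)])
            context = hidden  # copy hidden activations into the context units
        return outputs


# Hypothetical sizes: 4-dimensional frames, 6 hidden units, 3 semantic units.
net = SimpleRecurrentNetwork(n_in=4, n_hidden=6, n_out=3)
frames = [[0.1 * t, 0.2, -0.1 * t, 0.05] for t in range(5)]
semantic_activations = net.forward(frames)
```

In a sketch like this, the context copy is the only source of memory across frames, which is what allows a small network to condition its semantic output on the preceding acoustic input.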