ISCA Archive Eurospeech 1989

Recognition of continuous speech using neural nets and expert system

Anders Baekgaard, Paul Dalsgaard

A system for recognising continuously spoken sentences is presented. The system has a vocabulary of approximately 35 words and a grammar specifying a few thousand sentences. It operates in three stages. In the first stage, cepstrum vectors are computed in real time and used as input to a self-organising neural network. The output of the network is mapped to a continuous-valued acoustic-phonetic distinctive feature vector for each frame of the speech signal. In the second stage, these vectors are processed by a multi-layer perceptron, which is trained to estimate segment boundaries. The output of this stage is a discrete-valued acoustic-phonetic distinctive feature for each segment of the speech signal (allophones). The third stage contains an expert system, which processes the allophones using a lexicon and a parsing system.
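The hand-off from frame-level to segment-level features in the second stage can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `segment_features`, the boundary threshold, the averaging over frames, and the binary quantisation are all assumptions made for illustration, and the boundary scores stand in for the output of the trained multi-layer perceptron, which the abstract does not specify.

```python
from typing import List, Sequence


def segment_features(
    frames: Sequence[Sequence[float]],
    boundary_scores: Sequence[float],
    threshold: float = 0.5,
) -> List[List[int]]:
    """Group frames into segments and emit one discrete vector per segment.

    frames: continuous-valued feature vectors, one per frame (assumed in [0, 1]).
    boundary_scores: hypothetical per-frame boundary estimates; a score at or
        above `threshold` marks the first frame of a new segment.
    """
    if len(frames) != len(boundary_scores):
        raise ValueError("frames and boundary_scores must have equal length")

    # Split the frame sequence at every estimated boundary.
    segments: List[List[Sequence[float]]] = []
    current: List[Sequence[float]] = []
    for vector, score in zip(frames, boundary_scores):
        if score >= threshold and current:
            segments.append(current)
            current = []
        current.append(vector)
    if current:
        segments.append(current)

    # Average each feature over the segment, then quantise to 0 or 1
    # (assumed quantisation; the abstract only says "discrete valued").
    result: List[List[int]] = []
    for segment in segments:
        dim = len(segment[0])
        means = [sum(v[d] for v in segment) / len(segment) for d in range(dim)]
        result.append([1 if m >= 0.5 else 0 for m in means])
    return result
```

Called with four frames and a single high boundary score on the third frame, the function returns two segment-level vectors, which a later stage could match against lexicon entries.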