ISCA Archive Eurospeech 1993

Integration of a prosodic component in an automatic speech recognition system

P. Langlais, Henri Meloni

We present the integration of a prosodic component in an automatic speech recognition system. We restrict our investigation to corpora of sentences with fixed grammatical structures, uttered in a straightforward manner. A bottom-up identification system of hierarchically organized prosodic labels locates in a sentence the occurrences of certain prosodic phenomena (two-way emergence of a vowel's fundamental frequency, lengthening of its duration, ...). A statistical analysis module then quantifies, for a given corpus, the correlations between linguistic units and particular configurations of labels. The resulting rules are used both for bottom-up identification of constituent boundaries in a sentence and for top-down verification of lexical hypotheses. The recognition process is split into several stages. First, bottom-up acoustic and phonetic decoding allows a lexical access module to output, for each detected vocalic area, a scored cohort of potential words. A first bottom-up extractor then removes from these cohorts the candidates that are incompatible with the measured prosodic factors, and updates every cohort by adding to the acoustic score of each word its micro-prosodic mark. Next come the grammatically correct phonemic strings, for which scored hypotheses are proposed on the utterance divided into intonation groups. Finally, each candidate sentence is given a mark representing how well its measured prosodic parameters match those found in the corpus.
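The cohort filtering stage can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Candidate` type, the single `expects_lengthening` cue, the `rescore_cohort` function, and the additive `prosodic_bonus` are all hypothetical names and simplifications chosen for illustration. The sketch assumes one binary prosodic observation (vowel lengthening) per vocalic area, drops candidates that require a cue that was not measured, and adds a prosodic mark to the acoustic score of the remaining words.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical word candidate attached to one vocalic area."""
    word: str
    acoustic_score: float
    expects_lengthening: bool  # assumed prosodic expectation for this word


def rescore_cohort(cohort, lengthening_observed, prosodic_bonus=1.0):
    """Filter a cohort on one prosodic cue, then add a prosodic mark.

    Assumption: a candidate that expects lengthening is incompatible with
    an area where no lengthening was measured, and is removed.
    """
    kept = []
    for cand in cohort:
        if cand.expects_lengthening and not lengthening_observed:
            continue  # incompatible with the measured prosody
        # Reward agreement between expectation and measurement.
        agrees = cand.expects_lengthening == lengthening_observed
        bonus = prosodic_bonus if agrees else 0.0
        kept.append((cand.word, cand.acoustic_score + bonus))
    # Best combined score first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


cohort = [Candidate("a", 2.0, True), Candidate("b", 2.5, False)]
with_lengthening = rescore_cohort(cohort, lengthening_observed=True)
without_lengthening = rescore_cohort(cohort, lengthening_observed=False)
```

In this toy setting, a measured lengthening promotes the candidate that expects it, while its absence removes that candidate from the cohort altogether. A real system would combine several cues and weights estimated from the corpus statistics described in the abstract.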