ISCA Archive Eurospeech 1995

Robust parsing of n-best speech hypothesis lists using a general grammar-based language model

Manny Rayner, Peter Wyard

We describe a series of experiments designed to investigate the feasibility of using a general, linguistically motivated grammar of English to improve the language model of a speech recognizer. A largely automatic corpus-based method was used to convert the general grammar into a specialised version tuned to the domain. This specialised grammar was then used to parse N-best speech hypothesis lists produced by a recognizer, using an algorithm which optionally allowed deletions or substitutions at the beginnings and ends of utterances. Competing robust analyses were scored using a weighted combination of several corpus-based preference functions. The sentence accuracy of the recognizer improved from 34.5% to 39% on a metric which regarded close variants of the reference sentence as successes.
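
The general shape of the rescoring step described above can be illustrated with a minimal sketch. It is not the authors' implementation. Everything named here is an assumption made for illustration: the `Candidate` type, the `rerank` and `edge_trimmed` helpers, the three preference functions, their weights, and the toy "grammar" (a set of accepted word sequences standing in for a real parser). Only deletions at the utterance edges are modelled, and substitutions are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    words: Tuple[str, ...]   # words kept after trimming the edges
    rank: int                # position of the source hypothesis in the N-best list
    recognizer_score: float  # recognizer score (higher is better)
    deleted: int             # number of words dropped at the edges


def edge_trimmed(words: Sequence[str],
                 max_trim: int) -> Iterator[Tuple[Tuple[str, ...], int]]:
    """Yield variants with up to max_trim words removed from each end."""
    n = len(words)
    for left in range(max_trim + 1):
        for right in range(max_trim + 1):
            if left + right < n:  # always keep at least one word
                yield tuple(words[left:n - right]), left + right


def rerank(nbest: List[Tuple[str, float]],
           parses: Callable[[Sequence[str]], bool],
           features: Dict[str, Callable[[Candidate], float]],
           weights: Dict[str, float],
           max_trim: int = 1) -> Optional[Candidate]:
    """Return the best-scoring parsable candidate, or None if nothing parses."""
    best, best_score = None, float("-inf")
    for rank, (text, rec_score) in enumerate(nbest):
        for kept, deleted in edge_trimmed(text.split(), max_trim):
            if not parses(kept):
                continue
            cand = Candidate(kept, rank, rec_score, deleted)
            # Weighted linear combination of the preference functions.
            score = sum(weights[name] * fn(cand) for name, fn in features.items())
            if score > best_score:
                best, best_score = cand, score
    return best


# Hypothetical stand-in for the specialised grammar: a set of accepted sentences.
TOY_SENTENCES = {("show", "me", "flights", "to", "boston")}


def toy_parses(words: Sequence[str]) -> bool:
    return tuple(words) in TOY_SENTENCES


# Hypothetical preference functions and hand-picked weights.
FEATURES = {
    "recognizer": lambda c: c.recognizer_score,
    "deletions": lambda c: float(c.deleted),
    "rank": lambda c: float(c.rank),
}
WEIGHTS = {"recognizer": 1.0, "deletions": -2.0, "rank": -0.5}

nbest = [
    ("show me flights to bostons", -10.0),    # no parse, even after trimming
    ("uh show me flights to boston", -10.5),  # parses after deleting "uh"
    ("show me flights to boston", -11.0),     # parses as is
]
best = rerank(nbest, toy_parses, FEATURES, WEIGHTS)
```

In this toy list the top recognizer hypothesis receives no analysis, the second needs one edge deletion, and the third parses intact. With the illustrative weights, the deletion penalty outweighs the small difference in recognizer score, so the third hypothesis is selected. In a real system the weights would be estimated from a corpus rather than set by hand.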