ISCA Archive Eurospeech 1997

Modeling linguistic segment and turn boundaries for n-best rescoring of spontaneous speech

Andreas Stolcke

Language modeling, especially for spontaneous speech, often suffers from a mismatch of utterance segmentations between training and test conditions. In particular, training often uses linguistically based segments, whereas testing occurs on acoustically determined segments, resulting in degraded performance. We present an N-best rescoring algorithm that removes the effect of segmentation mismatch. Furthermore, we show that explicit language modeling of hidden linguistic segment boundaries is improved by including turn-boundary events in the model.