ISCA Archive Eurospeech 2003

Discriminative methods for improving named entity extraction on speech data

James Horlock, Simon King

We present a method for discriminatively training language models for spoken language understanding, and we show that the resulting models improve named entity F-scores on speech data. The method compares the theoretical probabilities associated with the manual markup against the actual probabilities of the output markup in order to identify the probabilities that require adjustment. Our results support the hypothesis that F-scores can be improved by using either previously used training data or held-out development data to improve discrimination among a set of N-gram language models.
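
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a named entity F-score can be computed from a reference markup and a system output. It assumes exact-match scoring over (start, end, label) spans, which may differ from the scoring scheme used in the paper; the function name `entity_f_score` and the example spans are hypothetical and are not taken from the paper.

```python
def entity_f_score(reference, hypothesis, beta=1.0):
    """F-score over named entity spans, assuming exact-match scoring.

    Each entity is a (start, end, label) tuple. A hypothesis entity counts
    as correct only if its boundaries and label both match a reference
    entity exactly (an assumption made for this illustration).
    """
    ref = set(reference)
    hyp = set(hypothesis)
    correct = len(ref & hyp)
    if correct == 0:
        # Covers empty inputs and the case of no matching entities.
        return 0.0
    precision = correct / len(hyp)
    recall = correct / len(ref)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical spans: token offsets plus an entity label.
reference = [(0, 2, "PERSON"), (5, 6, "LOCATION"), (9, 10, "ORGANIZATION")]
hypothesis = [(0, 2, "PERSON"), (5, 6, "ORGANIZATION")]

# One of two hypothesised entities is correct (precision 1/2), and one of
# three reference entities is found (recall 1/3).
score = entity_f_score(reference, hypothesis)
print(round(score, 3))
```

With `beta=1.0` precision and recall are weighted equally; other values of `beta` shift the balance toward recall (`beta > 1`) or precision (`beta < 1`).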