ISCA Archive Interspeech 2007

Selecting on-topic sentences from natural language corpora

Michael Levit, Elizabeth Boschee, Marjorie Freedman

We describe a system that examines input sentences with respect to arbitrary topics formulated as natural language expressions. It extracts predicate-argument structures from text intervals and links them into semantically organized proposition trees. By instantiating the trees constructed for topic descriptions in the trees that represent input sentences or parts thereof, we can assess the degree of "topicality" of each sentence. The presented strategy was used in the BBN distillation system for the GALE Year 1 evaluation, where it achieved outstanding results compared to other systems and human participants.
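
The idea of instantiating a topic tree inside a sentence tree can be illustrated with a minimal sketch. This is not the BBN implementation: the `PropNode` class, the role labels, exact head-word matching, and the coverage-ratio score are all assumptions made for illustration, since the abstract does not specify any of them.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator


@dataclass
class PropNode:
    """Toy proposition-tree node (hypothetical): a head word plus role-labeled children."""
    head: str
    children: Dict[str, "PropNode"] = field(default_factory=dict)

    def size(self) -> int:
        """Number of nodes in the subtree rooted here."""
        return 1 + sum(child.size() for child in self.children.values())

    def subtrees(self) -> Iterator["PropNode"]:
        """Yield this node and every descendant, so a topic can match part of a sentence."""
        yield self
        for child in self.children.values():
            yield from child.subtrees()


def matched_nodes(topic: PropNode, sent: PropNode) -> int:
    """Count topic nodes instantiated when the topic root is aligned with `sent`.

    Assumption: two nodes match only if their head words are identical and
    they are reached through the same role label.
    """
    if topic.head != sent.head:
        return 0
    count = 1
    for role, topic_child in topic.children.items():
        sent_child = sent.children.get(role)
        if sent_child is not None:
            count += matched_nodes(topic_child, sent_child)
    return count


def topicality(topic: PropNode, sent: PropNode) -> float:
    """Assumed score: best fraction of topic nodes covered by any sentence subtree."""
    best = max(matched_nodes(topic, sub) for sub in sent.subtrees())
    return best / topic.size()


# Topic: "attacks in Baghdad"
topic = PropNode("attack", {"in": PropNode("baghdad")})

# Sentence: "Officials said the attack in Baghdad killed five."
sentence = PropNode("say", {
    "subj": PropNode("official"),
    "obj": PropNode("kill", {
        "subj": PropNode("attack", {"in": PropNode("baghdad")}),
        "obj": PropNode("five"),
    }),
})

print(topicality(topic, sentence))
```

In this sketch a sentence scores highly when some part of its tree contains the topic tree, which mirrors the "sentences or parts thereof" wording above; a real system would presumably need softer matching (synonyms, coreference, partial role agreement) than the exact comparison assumed here.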