ISCA Archive Eurospeech 1993

Black box and glass box evaluation of the SUNDIAL system

Andrew Simpson, Norman M. Fraser

The field of dialogue evaluation is still at a very early stage of development. This paper surveys relevant work and outlines the approach to evaluation developed in the SUNDIAL project. The approach evaluates a system using a battery of metrics, divided between those that treat the system as a black box and those that look inside at parts of it (as though it were a glass box). Some of these metrics require subjective judgement, so they cannot be fully automated. We argue that this is a reasonable price to pay for a well-rounded evaluation of a spoken dialogue system.

Keywords: Spoken dialogue systems, Evaluation, Black box metrics, Glass box metrics.