ISCA Archive Eurospeech 1993

A methodology for evaluating human-machine spoken language interaction

Cristina Delogu, Andrea Di Carlo, Ciro Sementina, Silvia Stecconi

This paper reports an experiment for evaluating human-machine spoken language interaction. A phone directory with information on Fondazione Bordoni employees was accessed by 54 users. The system was composed of a telephone interface, a simulated speech recognition system (Wizard of Oz), a database with the information on employees, a natural language processing module, a response generator, and a text-to-speech synthesizer. Three levels of evaluation were identified: an overall evaluation of the user-system interaction, an evaluation of the user's performance, and an evaluation of the system. Quantitative results and qualitative observations are reported. The three kinds of evaluation were performed along two dimensions: modality of generation (natural vs automatic) and test repetition (Test1 vs Test2).

Keywords: Spoken Language Systems evaluation; dialogue understanding; Wizard of Oz simulation