ISCA Archive SSW 1998

Comparison of subjective evaluation and an objective evaluation metric for prosody in text-to-speech synthesis

Daniel Hirst, Albert Rilliard, Véronique Aubergé

An experimental technique is described for eliciting a subjective evaluation of the prosody of synthetic speech from untrained listeners. The technique makes use of a graphic display time-aligned with the speech signal. Subjects are asked to indicate which parts of a recording are unsatisfactory by clicking on a computer screen with a mouse. The technique was applied to two TTS systems for French. Results obtained using this technique are to be compared with those obtained using an objective evaluation metric for prosodic characteristics, which compares the synthetic versions with a number of different readings by human speakers.