ISCA Archive Interspeech 2025

Enabling the replicability of speech synthesis perceptual evaluations

Sébastien Le Maguer, Gwénolé Lecorvé, Damien Lolive, Naomi Harte, Juraj Šimko

How speech synthesis is evaluated is increasingly questioned. Not only have conventional listening tests been shown to be a poor match for modern synthesis but, more fundamentally, important information (e.g., the question asked to the listener) is frequently missing from reports of evaluation outcomes, despite its impact on the interpretation of the test results. This can cast doubt on the validity of these evaluations. To address this issue, we propose standardising the structure of evaluation reports. To facilitate this standardisation, our contribution is twofold: an open-source subjective evaluation platform and a set of reporting guidelines. The platform is designed to enable the development of easily shareable evaluation recipes. The guidelines complement the platform by supporting researchers in reporting their evaluation choices and analysis in more detail, while relying on the recipe to describe the actual evaluation process.