ISCA Archive Interspeech 2011

Crowdsourcing preference tests, and how to detect cheating

Sabine Buchholz, Javier Latorre

We describe an approach to crowdsourcing the evaluation of text-to-speech (TTS) systems through preference tests and report on lessons learnt from running 127 real-life crowdsourced tests. We show that at least one type of cheating becomes more prevalent over time if left unchecked, and we develop metrics to identify and exclude cheaters. We demonstrate that excluding them improves test outcomes.