ISCA Archive Blizzard 2012

Towards Perceptual Quality Modeling of Synthesized Audiobooks – Blizzard Challenge 2012

Christoph R. Norrenbrock, Florian Hinterleitner, Ulrich Heute, Sebastian Möller

This paper reports on recent advances in instrumental quality evaluation of text-to-speech (TTS) synthesis. In particular, a wide range of acoustic quality markers is analyzed with respect to their power to describe perceived quality, using the audiobook data from the Blizzard Challenge 2012. Several approaches to perceptual modeling are investigated and compared. The results reveal substantial correlations, as high as 0.87, between subjective ratings of overall impression and their instrumental estimates.
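
As an illustration of the kind of figure reported above, the following minimal sketch computes a correlation coefficient between subjective ratings and instrumental estimates. It assumes the Pearson coefficient, which the abstract does not specify; the function name `pearson_r` and the rating values are hypothetical and are not data from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy values for illustration only (assumption, NOT results from the paper):
# one subjective rating and one model estimate per synthesized stimulus.
subjective = [2.1, 3.4, 2.8, 4.0, 3.1]
estimated = [2.4, 3.1, 2.9, 3.8, 3.5]

r = pearson_r(subjective, estimated)
print(f"correlation between ratings and estimates: {r:.2f}")
```

A value close to 1 would indicate that the estimates rank and scale the stimuli much as listeners do; a rank-based coefficient such as Spearman's could be substituted if only the ordering matters.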