ISCA Archive Blizzard 2012

The NTUT Blizzard Challenge 2012 Entry

Yuan-Fu Liao, Chia-Chi Lin, Jiun-Yan Pan

This paper describes the HMM-based speech synthesis system (HTS) [1] that we submitted to the Blizzard Challenge 2012 [2]. It is our first English TTS system and our first audiobook application. In addition to conventional linguistic features, the system extracts semantic features beyond the sentence level: (1) the semantic topics and (2) the punctuation marks (PMs) of the current and surrounding sentences, and (3) the number of sentences in a paragraph together with the forward and backward position of each sentence. In particular, a Latent Dirichlet Allocation (LDA) [3]-based approach was adopted to analyze the topic of an input sentence, and the resulting topic was used both (1) in decision tree-based clustering and (2) to adjust the durations of inter-sentence breaks.
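As a rough illustration of features (2) and (3), the following Python sketch derives punctuation-mark and position features for each sentence of a paragraph. It is a minimal sketch under our own assumptions, not the system's actual front end: the names `SentenceContext`, `final_punctuation`, and `paragraph_features`, the set of punctuation marks considered, and the 1-based position encoding are all hypothetical, and the LDA topic feature (1) is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed inventory of sentence-final punctuation marks (hypothetical choice).
PUNCTUATION = ".?!;:,"


@dataclass
class SentenceContext:
    prev_pm: Optional[str]   # PM of the preceding sentence, None at paragraph start
    curr_pm: Optional[str]   # PM of the current sentence
    next_pm: Optional[str]   # PM of the following sentence, None at paragraph end
    num_sentences: int       # number of sentences in the paragraph
    forward_pos: int         # 1-based position counted from the paragraph start
    backward_pos: int        # 1-based position counted from the paragraph end


def final_punctuation(sentence: str) -> Optional[str]:
    """Return the sentence-final punctuation mark, ignoring closing quotes/brackets."""
    stripped = sentence.rstrip().rstrip("\"')")
    if stripped and stripped[-1] in PUNCTUATION:
        return stripped[-1]
    return None


def paragraph_features(sentences: List[str]) -> List[SentenceContext]:
    """Compute PM and position features for every sentence in one paragraph."""
    n = len(sentences)
    pms = [final_punctuation(s) for s in sentences]
    return [
        SentenceContext(
            prev_pm=pms[i - 1] if i > 0 else None,
            curr_pm=pms[i],
            next_pm=pms[i + 1] if i < n - 1 else None,
            num_sentences=n,
            forward_pos=i + 1,
            backward_pos=n - i,
        )
        for i in range(n)
    ]


paragraph = ["It was late.", "Who was there?", "Nobody answered!"]
features = paragraph_features(paragraph)
```

In an HTS-style system, values like these would typically be appended to the full-context labels so that they are available as questions during decision tree-based clustering; how the submitted system encodes them is not specified here.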