This paper describes a hidden Markov model (HMM) based speech synthesis system developed for the Blizzard Challenge 2011. For this challenge, we focused on the training algorithm for HMM-based speech synthesis. To alleviate the local maxima problem in maximum likelihood estimation, we apply the deterministic annealing expectation maximization (DAEM) algorithm to HMM training, which allows more reliable acoustic model parameters to be estimated. In addition, we apply stepwise model selection to model training. In HMM-based speech synthesis, decision tree based context clustering serves as the model selection step. With the stepwise model selection method, the decision trees are grown gradually from small trees into large trees in order to estimate reliable acoustic models. Subjective evaluation results show that the system synthesized highly intelligible speech.
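
For reference, the tempered posterior at the core of DAEM can be sketched as follows. This is the general form of the algorithm rather than the specific configuration used in this system, and the symbols are notation assumed here: $O$ is the observation sequence, $q$ a hidden state sequence, $\lambda$ the model parameters, and $\beta$ the temperature parameter.

$$
f_{\beta}(q \mid O, \lambda) = \frac{P(O, q \mid \lambda)^{\beta}}{\sum_{q'} P(O, q' \mid \lambda)^{\beta}}, \qquad 0 < \beta \le 1
$$

In the E-step, this tempered posterior replaces the ordinary posterior $P(q \mid O, \lambda)$. The parameter $\beta$ is raised gradually from a small positive value toward 1, and at $\beta = 1$ the expression reduces to the standard EM posterior. The schedule for $\beta$ is not specified in the text above.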