In this paper we present the CMU entry to the Blizzard Challenge 2018. We begin with a description of the build process for our base voice. We then present the following modifications to the base voice: (1) Since the data is drawn from children’s stories, we employ Rhetorical Structure Theory to obtain relationships between sentences, and we specifically model the contrastive relationship between sentences within a paragraph. (2) The original speaker varies the speaking style depending on the character and the situation in the story. To model this, we condition our acoustic model on character and quote type information. (3) To improve voice quality, we present ‘segmental WaveNet’, a variant of the popular autoregressive WaveNet framework.