Naturalness in human speech depends on a number of factors, and the extent to which a text-to-speech synthesis system can account for these factors in its model will be a measure of its success in the marketplace. As well as the obvious factors of rhythm and intonation, there is the more difficult question of modelling the variability in human speech. This paper discusses how SPRUCE [1], a high-level text-to-speech synthesis system, incorporates several different types of variability.