The exploration of uncanny valley effects (UVE), a distaste for entities that appear almost but not quite human, has been a productive topic of research in human-robot interaction. Meanwhile, realistic text-to-speech (TTS) voices are increasingly encountered in various settings. In this work, we aim to describe the relationship between the perceived human-likeness and pleasantness of TTS voices and seek evidence of auditory UVE in listeners’ evaluations. In an online between-subjects experiment, listeners rated an array of manipulated TTS voices, trained using a single speaker’s data. The evidence obtained is compatible with a slight plateau in a generally positive correlation between realism and approval. All the TTS voices used received average ratings below 50% for ‘human-likeness’, and therefore conclusions about UVE, i.e. negative reactions to voices perceived as very human-like, cannot be drawn from these data. Our results suggest that, although a correlation exists, high realism may not be necessary for relatively high approval; on average, voices with decreased pitch variation were rated about twice as highly for being ‘pleasant’ and ‘friendly’ as for being ‘like a human’. The relationship between pitch variation and perceived realism is examined and identified as a direction for further research.