The quality of manual annotations of speech corpora depends on the ability of human annotators to cope with phonetic and prosodic coding schemes such as SAMPA and ToBI. It has been widely claimed that an acceptable level of inter- and intra-annotator reliability is impossible to achieve. In this paper, we present an extensive evaluation of annotator reliability in a multilevel phonetically annotated speech corpus, using two methods for measuring annotator reliability. The results show that manual annotations can be highly reliable, but that reliability correlates with the complexity of the coding scheme.
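The abstract does not name the two reliability measures used in the study; as one illustration of how inter-annotator reliability on categorical labels is commonly quantified, the sketch below computes Cohen's kappa, a chance-corrected agreement score. The annotator data are invented for the example.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labelled identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two hypothetical annotators labelling ten segments with SAMPA-style symbols.
ann1 = ["a", "a", "E", "E", "O", "a", "E", "O", "a", "E"]
ann2 = ["a", "a", "E", "O", "O", "a", "E", "O", "E", "E"]
print(round(cohens_kappa(ann1, ann2), 3))  # → 0.697
```

A kappa near 1 indicates agreement well above chance, while values near 0 mean the annotators agree no more often than random labelling would predict; richer coding schemes with many categories typically yield lower kappa, consistent with the correlation between reliability and schema complexity reported here.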