The paper presents a labelling system based on HMM modelling applied with a limited set of phonemic classes. The units of description have been chosen in such a way that each class of phones is relatively homogeneous in its mode of articulation, irrespective of context. The feature vector consists of 14 cepstral and 14 delta cepstral coefficients. The labelling errors were analyzed for two working modes: a) when the correct string of labels is known to the system, b) when it is not. In the former case, the reference transcription resulting from text-to-phoneme conversion is supplied to the Viterbi algorithm as a network sequence, so the only possible errors are in segment boundaries. In the latter case, all typical labelling errors are discussed.
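Mode (a) can be illustrated with a minimal sketch of Viterbi alignment against a known label sequence. It is not the system described in the paper. It assumes one state per label, no transition probabilities, and per-frame log scores for each class that have already been computed from the feature vectors. The names `forced_align`, `frame_scores` and `labels` are hypothetical.

```python
def forced_align(frame_scores, labels):
    """Align frames to a known label sequence.

    frame_scores[t][c] is the log score of class c at frame t (assumed given).
    labels is the reference sequence of class indices.
    Returns the start frame of each label in the sequence.
    """
    T, N = len(frame_scores), len(labels)
    if N == 0 or T < N:
        raise ValueError("need at least one frame per label")
    NEG = float("-inf")
    # best[t][j]: best log score of frames 0..t with frame t assigned to labels[j]
    best = [[NEG] * N for _ in range(T)]
    # moved[t][j]: True if the best path entered labels[j] exactly at frame t
    moved = [[False] * N for _ in range(T)]
    best[0][0] = frame_scores[0][labels[0]]
    for t in range(1, T):
        for j in range(N):
            stay = best[t - 1][j]
            advance = best[t - 1][j - 1] if j > 0 else NEG
            prev = max(stay, advance)
            if prev > NEG:
                best[t][j] = prev + frame_scores[t][labels[j]]
                moved[t][j] = advance > stay
    # Backtrace from the last frame, which must belong to the last label.
    starts = [0] * N
    j = N - 1
    for t in range(T - 1, 0, -1):
        if moved[t][j]:
            starts[j] = t
            j -= 1
    return starts


# Toy input: two classes, six frames, reference sequence 0, 1, 0.
GOOD, BAD = -0.1, -3.0
scores = [
    [GOOD, BAD], [GOOD, BAD],               # frames 0-1 favour class 0
    [BAD, GOOD], [BAD, GOOD], [BAD, GOOD],  # frames 2-4 favour class 1
    [GOOD, BAD],                            # frame 5 favours class 0
]
boundaries = forced_align(scores, [0, 1, 0])
print(boundaries)
```

Because the label sequence is fixed, the search can only decide where each label starts, which is why boundary placement is the only source of error in this mode.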