Natural and synthetic voices differ in several respects, as shown by the results of many experiments. These differences may be due to differences in the acoustic-phonetic structure of the two signals. To investigate this hypothesis, we ran a consonant confusion test on 19 Italian consonants produced by a natural voice in noise (3 S/N ratios) and by 6 TTS systems, presented through good-quality and telephone channels. The results showed that the distributions of consonant confusions for natural and synthetic speech (both formant-based and diphone-based synthesis) were often quite different, suggesting some contradictions in the acoustic cues and in the coarticulation model of the synthetic signals.