We study the problem of phonetic modeling for continuous Mandarin speech recognition by providing a systematic performance comparison of systems based on the following primitive speech units: syllables, demi-syllables (Initials and Finals), context-independent phones, left-or-right context-dependent phones (diphones), and left-and-right context-dependent phones (triphones). In our speaker-dependent continuous speech recognition experiments, a generalized triphone system achieved the best performance of all the systems compared. This result contrasts with most other Mandarin speech recognition systems, which have been based on demi-syllable units.
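
To make the unit inventory concrete, the following minimal sketch shows how a single syllable could be expressed at each level of granularity compared above. The example syllable, its phone decomposition, the `sil` boundary symbol, the `left-phone+right` label notation, and the helper `context_dependent_units` are illustrative assumptions, not the phone set or labeling scheme used in the experiments. The sketch also does not cover the clustering step that turns plain triphones into generalized triphones.

```python
def context_dependent_units(phones, left=True, right=True, boundary="sil"):
    """Expand a phone sequence into context-dependent unit labels.

    Labels use a 'left-phone+right' notation (an assumed convention).
    left=True,  right=True  -> triphones
    left=True,  right=False -> left-context diphones
    left=False, right=True  -> right-context diphones
    """
    # Pad with an assumed boundary symbol so edge phones have a context.
    padded = [boundary] + list(phones) + [boundary]
    units = []
    for i in range(1, len(padded) - 1):
        label = padded[i]
        if left:
            label = f"{padded[i - 1]}-{label}"
        if right:
            label = f"{label}+{padded[i + 1]}"
        units.append(label)
    return units


# Hypothetical decomposition of the pinyin syllable "zhang".
syllable = "zhang"                # syllable unit
initial, final = "zh", "ang"      # demi-syllable units (Initial, Final)
phones = ["zh", "a", "ng"]        # context-independent phones (assumed inventory)

triphones = context_dependent_units(phones)
left_diphones = context_dependent_units(phones, right=False)
right_diphones = context_dependent_units(phones, left=False)

print(triphones)
print(left_diphones)
print(right_diphones)
```

The finer the unit, the more distinct models must be trained: one syllable model becomes two demi-syllable models, three phone models, or three context-specific triphone models whose identity depends on the neighboring phones.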