This paper presents the use of multiple hidden Markov models (HMMs) to build a Czech acoustic unit inventory, together with speech synthesis based on that inventory. Triphone HMMs are trained on a speech corpus recorded by a single speaker. The states of the triphone HMMs are clustered automatically using binary decision trees. The clustered states are then used to segment the speech corpus automatically and to create a speech segment database. The acoustic unit inventory constructed in this way is expected to allow more precise context modeling than was previously possible. A concatenation-based speech synthesizer can then be designed on the basis of the speech segment database, and several speech synthesis techniques are discussed for this purpose. Finally, a Czech text-to-speech (TTS) system is presented.