The measure of the goodness, or cost, of concatenating synthesis units plays an important role in concatenative speech synthesis. In this paper, we present a probabilistic approach to concatenation modeling in which the goodness of concatenation is represented as the conditional probability of observing the spectral shape of a unit given the previous unit and the current phonetic context. This conditional probability is modeled by a conditional Gaussian density whose mean vector has a form of linear transform of the past spectral shape. A phonetic decision-tree based parameter tying is performed to achieve a robust training that balances between model complexity and the amount of training data available. The concatenation models are implemented in a corpus-based speech synthesizer trained with a CMU Arctic database and the effectiveness of the proposed method was confirmed by a subjective listening test.