In this paper, we introduce two reformulations of the standard EM algorithm, Successive Split EM and Split and Merge EM, to reduce the dependence on initialization in data-driven Speech Trajectory Clustering. Both algorithms allow us to prevent the EM procedure in Trajectory Clustering from ending in a local maximum of the likelihood surface, and as a result they generate more coherent trajectory clusters. We applied both methods to develop multiple parallel HMMs for a continuous digit recognition task, and compared their recognition performance with that of knowledge-based context-dependent Head-Body-Tail models. The results showed that both data-driven approaches significantly outperform the knowledge-based approach. In addition, in most cases the model based on Split and Merge EM performs better than the model based on Successive Split EM.