In this paper, we construct multi-path syllable models using phonetic knowledge for initialising the parallel paths, and a data-driven solution for their re-estimation. We hypothesise that the richer topology of multi-path syllable models would be better at accounting for pronunciation variation than context-dependent phone models that can only account for the effects of left and right neighbours. We show that parallel paths that are initialised with phonetic knowledge and then re-estimated do indeed result in different trajectories in feature space. Yet, this does not result in better recognition performance. We suggest explanations for this finding, and provide the reader with important insights into the issues playing a role in pronunciation variation modelling with multi-path syllable models.