This paper presents a novel technique for building a syllable-based continuous speech recognizer when transcribed but unannotated training data is available. We present two segmentation algorithms that split the speech and the corresponding text into comparable syllable-like units. A group-delay-based two-level segmentation algorithm is proposed to extract accurate syllable units from the speech data. A rule-based text segmentation algorithm is used to automatically annotate the text corresponding to the speech into syllable units. Isolated-style syllable models are built for all unique syllables using multiple frame size (MFS) and multiple frame rate (MFR), with examples collected from the annotated speech. Experiments on Tamil show that the recognition performance is comparable to that of recognizers built using manually segmented training data. These experiments suggest that system development cost can be reduced, with minimal manual effort, if sentence-level transcription of the speech data is available.