We address the problem of labeling phonemes in a large database of spoken sentences using a different, already labeled utterance of each sentence. Labeling an utterance requires a time alignment between that utterance and a reference utterance. Because this alignment compares parameter vectors from different speakers, it is subject to an acoustic mismatch, which limits the accuracy of conventional alignment methods that do not account for cross-speaker differences. We improve speech database labeling by iterating two steps: alignment against the labeled speech and transformation of the parameter vectors from the test speaker to the reference speaker. We propose a linear scalar transform for each component of the parameter vector. Under our experimental conditions, the misalignment between the reference and test utterances was reduced, and the improved alignment led to a higher recognition rate. A recognition system trained on the phonetic labels produced by the iterated alignment gave a 50% reduction in sentence recognition error rate.
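The iterative procedure can be illustrated with a minimal sketch: align the test frames to the labeled reference frames with dynamic time warping, fit an independent scalar linear map (slope and offset) per feature component from the aligned pairs, transform the test frames, and repeat. The function names, the simple DTW routine, and the least-squares fitting below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def dtw_path(ref, test):
    """Return aligned frame-index pairs (i, j) minimising summed Euclidean distance."""
    n, m = len(ref), len(test)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(ref[i - 1] - test[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the end of both utterances toward the start.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def fit_componentwise_linear(ref, test, path):
    """Least-squares slope/offset per component mapping test vectors toward reference."""
    R = np.array([ref[i] for i, _ in path])
    T = np.array([test[j] for _, j in path])
    a = np.empty(R.shape[1])
    b = np.empty(R.shape[1])
    for k in range(R.shape[1]):
        a[k], b[k] = np.polyfit(T[:, k], R[:, k], 1)
    return a, b

def iterative_align(ref, test, n_iter=3):
    """Alternate DTW alignment and per-component linear transformation of the test frames."""
    mapped = test.copy()
    for _ in range(n_iter):
        path = dtw_path(ref, mapped)
        a, b = fit_componentwise_linear(ref, test, path)
        mapped = test * a + b            # re-map the original test features
    return dtw_path(ref, mapped)         # final alignment used to copy the labels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(size=(40, 12))      # e.g. 12-dimensional cepstral frames
    test = 0.8 * ref + 0.3 + rng.normal(scale=0.05, size=(40, 12))
    print(iterative_align(ref, test)[:5])
```

In this sketch the linear transform is re-estimated from scratch at each iteration using the current alignment, so a better alignment yields a better transform and vice versa, which is the mechanism the abstract attributes the reduced misalignment to.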