This paper presents a new method for the statistical learning of the correspondence between spectral parameters measured from two different speakers uttering the same text. The method is based on a Gaussian mixture model of the speaker's spectral parameters. It is shown to be more efficient and robust than previously known techniques based on vector quantization. Results obtained on a large speech database demonstrate effective, high-quality transformations of voice characteristics.