ISCA Archive SSW 2007

An evaluation of many-to-one voice conversion algorithms with pre-stored speaker data sets

Daisuke Tani, Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano

This paper describes an evaluation of many-to-one voice conversion (VC) algorithms, which convert an arbitrary speaker’s voice into a particular target speaker’s voice. These algorithms effectively generate a conversion model for a new source speaker using multiple parallel data sets of many pre-stored source speakers and the single target speaker. We conducted experimental evaluations to demonstrate the conversion performance of each many-to-one VC algorithm, including not only the conventional algorithms based on a speaker-independent GMM and on eigenvoice conversion (EVC), but also new algorithms based on speaker selection and on EVC with speaker adaptive training (SAT). The results show that adapting the conversion model significantly improves conversion performance, and that the algorithm based on speaker selection works well even with a very limited amount of adaptation data.