ISCA Archive Interspeech 2021

Comparing Speech Enhancement Techniques for Voice Adaptation-Based Speech Synthesis

Nicholas Eng, C.T. Justine Hui, Yusuke Hioka, Catherine I. Watson

This study investigates the use of speech enhancement techniques for creating text-to-speech voices from degraded or noisy speech. Synthetic voices were created from speech that was first degraded by different noise types at various signal-to-noise ratios (SNRs) and then enhanced with one of four speech enhancement algorithms: subspace, Wiener filter, SEGAN, and a DNN-based method. Subjective listening tests show that, at low SNRs, synthetic voices built from speech enhanced by the subspace and DNN-based methods are of higher quality than voices built from Wiener filter or SEGAN enhanced speech, and that, at higher SNRs, speech enhanced by the subspace method yields higher quality synthetic speech.
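The degradation step described above, adding noise to clean speech at a chosen SNR, can be illustrated with a minimal sketch. The abstract does not state how the noise was mixed, so this assumes the common convention of scaling the noise so that the ratio of mean speech power to mean noise power matches the target SNR in dB. The function names `power`, `mix_at_snr` and `measured_snr_db` are hypothetical, and the sine tone and Gaussian noise are synthetic stand-ins for real recordings.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then add it.

    Assumption: SNR is defined over whole-signal mean power, which may
    differ from the definition used in the study.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_power = power(noise)
    if noise_power == 0:
        raise ValueError("noise must not be silent")
    # Required noise power: P_speech / 10^(SNR_dB / 10)
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    residual = [y - x for x, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


# Synthetic stand-ins: one second of a 220 Hz tone and white Gaussian noise at 16 kHz.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noisy = mix_at_snr(speech, noise, snr_db=5.0)
```

Calling `measured_snr_db(speech, noisy)` afterwards is a quick way to check that the mixture sits at the requested SNR before it is passed to an enhancement algorithm.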