ISCA Archive Interspeech 2014

Ensemble modeling of denoising autoencoder for speech spectrum restoration

Xugang Lu, Yu Tsao, Shigeki Matsuda, Chiori Hori

The denoising autoencoder (DAE) is effective in restoring clean speech from noisy observations. In addition, it can easily be stacked into a deep denoising autoencoder (DDAE) architecture to further improve performance. Most studies assume that the DAE or DDAE can learn any complex transform function to approximate the relation between noisy and clean speech. However, for large variations of speech patterns and noisy environments, the learned model lacks focus on local transformations. In this study, we propose an ensemble modeling of DAE to learn both the global and local transform functions. In the ensemble modeling, local transform functions are learned by several DAEs using data sets obtained from unsupervised data clustering and partitioning. The final transform function used for speech restoration is a combination of all the learned local transform functions. Speech denoising experiments were carried out to examine the performance of the proposed method. Experimental results showed that the proposed ensemble DAE model provided higher restoration accuracy than traditional DAE models.
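The cluster-then-combine idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the clustering algorithm, the network configuration, or the combination rule, so everything concrete here is an assumption made for illustration: plain k-means for the unsupervised partition, a one-hidden-layer tanh network as each local DAE, and a softmax over negative squared centroid distances as the combination weights. The names `kmeans`, `TinyDAE`, `train_ensemble`, and `restore` are hypothetical, the data is synthetic, and the sketch covers only the local models and their combination, not the authors' actual system.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain k-means (assumed clustering method); returns centroids and hard labels."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    return centroids, labels

class TinyDAE:
    """One-hidden-layer denoising autoencoder: noisy frame in, clean frame out."""
    def __init__(self, dim, hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0, 0.1, (dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0, 0.1, (hidden, dim))
        self.b2 = np.zeros(dim)

    def forward(self, noisy):
        h = np.tanh(noisy @ self.w1 + self.b1)
        return h, h @ self.w2 + self.b2

    def fit(self, noisy, clean, epochs=200, lr=0.01):
        n = len(noisy)
        if n == 0:  # empty cluster: leave the model at its initial weights
            return self
        for _ in range(epochs):
            h, out = self.forward(noisy)
            err = (out - clean) / n                    # gradient of the squared error w.r.t. out
            grad_h = (err @ self.w2.T) * (1 - h ** 2)  # backpropagate through tanh
            self.w2 -= lr * h.T @ err
            self.b2 -= lr * err.sum(axis=0)
            self.w1 -= lr * noisy.T @ grad_h
            self.b1 -= lr * grad_h.sum(axis=0)
        return self

    def predict(self, noisy):
        return self.forward(noisy)[1]

def train_ensemble(noisy, clean, k=2, hidden=8):
    """Partition the noisy frames, then train one local DAE per partition."""
    centroids, labels = kmeans(noisy, k)
    models = [TinyDAE(noisy.shape[1], hidden, seed=j).fit(noisy[labels == j], clean[labels == j])
              for j in range(k)]
    return centroids, models

def restore(noisy, centroids, models, temperature=1.0):
    """Combine local outputs with softmax weights (assumed combination rule)."""
    dist = ((noisy[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -dist / temperature
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    outputs = np.stack([m.predict(noisy) for m in models], axis=1)  # shape (n, k, d)
    return (weights[:, :, None] * outputs).sum(axis=1)

# Synthetic stand-in for spectral frames: two groups of 4-dimensional vectors.
rng = np.random.default_rng(0)
centers = rng.choice([-1.0, 1.0], size=(200, 1))
clean = centers + 0.3 * rng.normal(size=(200, 4))
noisy = clean + 0.2 * rng.normal(size=(200, 4))
centroids, models = train_ensemble(noisy, clean, k=2)
restored = restore(noisy, centroids, models)
```

Soft weighting is used here so that frames near a cluster boundary blend neighboring local models instead of switching abruptly; a hard assignment to the nearest centroid would be an equally plausible reading of "combination".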

doi: 10.21437/Interspeech.2014-222

Cite as: Lu, X., Tsao, Y., Matsuda, S., Hori, C. (2014) Ensemble modeling of denoising autoencoder for speech spectrum restoration. Proc. Interspeech 2014, 885-889, doi: 10.21437/Interspeech.2014-222

  author={Xugang Lu and Yu Tsao and Shigeki Matsuda and Chiori Hori},
  title={{Ensemble modeling of denoising autoencoder for speech spectrum restoration}},
  booktitle={Proc. Interspeech 2014},