ISCA Archive Interspeech 2014

Using hidden Markov models for speech enhancement

Akihiro Kato, Ben Milner

This work presents an approach to speech enhancement that uses a speech production model to reconstruct a clean speech signal from a set of speech parameters estimated from the noisy speech. The motivation is to remove the distortion, residual noise and musical noise that are associated with conventional filtering-based methods of speech enhancement. The STRAIGHT vocoder forms the model for speech reconstruction and requires a time-frequency surface and fundamental frequency information. Hidden Markov model synthesis is used to create an estimate of the time-frequency surface, and this estimate is combined with the noisy surface using a perceptually motivated signal-to-noise ratio weighting. Experimental results compare the proposed reconstruction-based method to conventional filtering-based approaches to speech enhancement.
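
The abstract does not give the form of the signal-to-noise ratio weighting, so the following is only a minimal sketch of how an HMM-synthesized time-frequency surface might be blended with the noisy one. The logistic weighting curve, its `midpoint_db` and `slope` parameters, and the function names `snr_weight` and `combine_surfaces` are all illustrative assumptions, not the authors' method. The idea shown is simply that bins with high local SNR keep the noisy observation, while bins with low local SNR fall back on the synthesized estimate.

```python
import math


def snr_weight(snr_db, midpoint_db=5.0, slope=0.5):
    """Map a local SNR in dB to a weight in (0, 1).

    Assumption: a logistic curve stands in for the paper's
    perceptually motivated weighting, which the abstract does not specify.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def combine_surfaces(noisy, synthesized, snr_db):
    """Blend two time-frequency surfaces bin by bin.

    Each argument is a list of frames, and each frame is a list of
    per-frequency-bin values. A high SNR favours the noisy surface;
    a low SNR favours the HMM-synthesized surface.
    """
    combined = []
    for noisy_frame, synth_frame, snr_frame in zip(noisy, synthesized, snr_db):
        frame = []
        for n, s, snr in zip(noisy_frame, synth_frame, snr_frame):
            w = snr_weight(snr)  # hypothetical weighting, see docstring
            frame.append(w * n + (1.0 - w) * s)
        combined.append(frame)
    return combined
```

In a full system the combined surface would then be passed, together with the fundamental frequency estimate, to the vocoder for reconstruction; that step is outside the scope of this sketch.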