ISCA Archive Interspeech 2015

Complex tensor factorization in modulation frequency domain for single-channel speech enhancement

Shogo Masaya, Masashi Unoki

This paper proposes a novel method of speech enhancement that uses tensor factorization in the modulation frequency domain, an extension of complex non-negative matrix factorization (CMF). Non-negative matrix factorization (NMF) has recently attracted a great deal of attention as an approach to speech enhancement because it makes feature detection in the acoustic frequency domain easy. However, previous studies have suggested that spectral processing in the modulation frequency domain, such as spectral subtraction, is an effective scheme for speech enhancement. To utilize more information on speech, the modulation frequency domain requires the use of not only the amplitude information but also the phase information. Thus, we present a new tensor factorization of the complex spectrum in the modulation frequency domain for single-channel speech enhancement. The amplitude and phase spectra in the acoustic frequency domain can be estimated from the factorized complex spectra in the modulation frequency domain. Numerical experiments were carried out under several noisy conditions to evaluate the effectiveness of the proposed method. The signal-to-error ratio and the signal-to-noise ratio loss were used as objective measures. The results revealed that the proposed method outperformed existing speech enhancement methods based on NMF and CMF.
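As an illustration only, and not the authors' implementation, the following sketch shows one common way to obtain a complex modulation-domain representation on which a tensor factorization could operate. A first short-time Fourier transform (STFT) gives the complex acoustic spectrum, and a second STFT along time in each acoustic frequency bin gives a three-way array indexed by acoustic frequency, modulation segment, and modulation frequency. The abstract does not specify the analysis, so the function names (`hann`, `dft`, `stft`, `modulation_tensor`), the window lengths, the hop sizes, and the choice to transform the complex trajectory are all assumptions.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft(frame):
    """Naive O(n^2) discrete Fourier transform; fine for a small sketch."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def stft(x, win_len, hop):
    """Windowed DFT of successive frames; returns frames[t][k] (complex)."""
    w = hann(win_len)
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        segment = [x[start + i] * w[i] for i in range(win_len)]
        frames.append(dft(segment))
    return frames


def modulation_tensor(x, win_len=32, hop=8, mod_len=8, mod_hop=4):
    """Two-stage analysis (assumed): acoustic STFT, then an STFT over time
    of each acoustic bin's complex trajectory.

    Returns tensor[k][s][m]: acoustic bin k, modulation segment s,
    modulation bin m. All entries are complex, so both amplitude and
    phase are retained in the modulation frequency domain.
    """
    spec = stft(x, win_len, hop)           # spec[t][k]
    n_bins = win_len // 2 + 1              # non-redundant bins of a real signal
    tensor = []
    for k in range(n_bins):
        trajectory = [frame[k] for frame in spec]   # complex sequence over time
        tensor.append(stft(trajectory, mod_len, mod_hop))
    return tensor


# Toy input: a 440 Hz tone with a 4 Hz amplitude modulation, 256 samples.
fs = 8000
x = [
    math.sin(2 * math.pi * 440 * n / fs) * (1 + 0.5 * math.sin(2 * math.pi * 4 * n / fs))
    for n in range(256)
]
tensor = modulation_tensor(x)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # prints: 17 6 8
```

With these assumed settings the toy signal yields 17 acoustic bins, 6 modulation segments, and 8 modulation bins. Because the trajectories are complex, the second transform has no conjugate symmetry and all modulation bins are kept. A practical system would use an FFT library and much longer windows.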