ISCA Archive Eurospeech 1993

Subband array processing for speech enhancement

Kristian Kroschel, Keld Lange

Classical array processing systems for speech enhancement comprise three components. The first compensates the delay of the speech signal between the microphone channels, the second applies spectral subtraction or Wiener filtering to improve the signal-to-noise ratio, and the third suppresses residual noise artifacts such as musical tones. This paper presents a new system based on the same basic approach. Instead of pure delay compensation, an equalizer is used that compensates both the delay of the speech signals and the differences between the transfer functions of the microphone channels. Depending on their position, the microphones are assigned to subbands of the speech spectrum, which avoids the dynamic delay compensation that movement of the speaker's head would otherwise require. A third improvement over the classical approach is that, instead of a classical FFT algorithm, the Short-Time Fourier Transform (STFT) proposed by Portnoff is used, implemented by means of the FFT. This transform is advantageous because both the speech signal and the noise are nonstationary processes. With this configuration, the post-processing reduces to a simple addition of the partial results, because the musical tones have already been largely removed by the other components of the system. It is shown that the method presented in this paper can be realized in real time on an Intel i860 processor.

Keywords: Array processing, noise reduction, speech enhancement, Short-Time Fourier Transform.
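
For orientation, the first two stages of the classical pipeline named in the abstract (delay compensation followed by spectral subtraction) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' system: it uses fixed integer sample delays rather than the proposed equalizer, a plain Hann-windowed FFT with overlap-add rather than Portnoff's STFT formulation, and a noise magnitude estimate taken from a noise-only recording. All function names, the frame length of 256, the hop of 128, and the spectral floor of 0.05 are hypothetical choices made for the example.

```python
import numpy as np

def delay_and_sum(channels, delays):
    """Advance each channel by its (assumed known, integer) delay and average."""
    aligned = [np.concatenate([x[d:], np.zeros(d)]) for x, d in zip(channels, delays)]
    return np.mean(aligned, axis=0)

def stft(x, frame=256, hop=128):
    """Windowed FFT of overlapping frames (periodic Hann, 50% overlap)."""
    win = np.hanning(frame + 1)[:-1]  # periodic Hann: shifted copies sum to 1
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(X, frame=256, hop=128):
    """Overlap-add resynthesis; exact except in the first and last hop samples."""
    frames = np.fft.irfft(X, n=frame, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + frame)
    for i, f in enumerate(frames):
        out[i * hop:i * hop + frame] += f
    return out

def estimate_noise_mag(noise, frame=256, hop=128):
    """Average magnitude spectrum of a noise-only segment (an assumption)."""
    return np.abs(stft(noise, frame, hop)).mean(axis=0)

def spectral_subtraction(x, noise_mag, floor=0.05, frame=256, hop=128):
    """Subtract the noise magnitude per bin, keep the noisy phase."""
    X = stft(x, frame, hop)
    mag = np.abs(X)
    # The spectral floor limits over-subtraction, one common way to
    # reduce (not eliminate) musical tones.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    return istft(clean_mag * np.exp(1j * np.angle(X)), frame, hop)
```

A typical use would align the microphone signals with `delay_and_sum`, estimate `noise_mag` from a speech-free segment, and pass both to `spectral_subtraction`. In the system described in the abstract, the equalizer and the subband assignment of the microphones would take the place of the fixed-delay alignment shown here.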