ISCA Archive Interspeech 2014

Co-channel speech detection via spectral analysis of frequency modulated sub-bands

Navid Shokouhi, Seyed Omid Sadjadi, John H. L. Hansen

Overlapped speech is known to degrade the performance of automatic speech systems. In this study, a sub-band speech analysis technique is proposed to detect overlapped-speech segments in single-channel multi-speaker scenarios (i.e., co-channel speech). Sub-band signals are obtained by decomposing the input speech using a gammatone filterbank. The filterbank outputs are then used to modulate the frequency argument of a sinusoidal carrier. We show that the spectra of these frequency-modulated signals, namely Gammatone Sub-band Frequency Modulation (GSFM) features, are more dispersed in overlapped-speech segments than in single-speaker regions. We quantify the dispersion rate to obtain a measure of the amount of overlapped speech in a given speech segment. Overlap detection experiments are conducted using the speech separation challenge corpus, and GSFM features are compared to commonly used overlap detection features. Detection errors are reduced by a relative 50% across signal-to-interference ratios ranging from 0 to 9 dB.
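The processing chain described in the abstract (gammatone decomposition, frequency modulation of a sinusoidal carrier by each sub-band, and a dispersion measure on the resulting spectrum) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the filter design, carrier frequency, frequency deviation, or dispersion measure, so everything below is an assumption made for illustration: the function names, the 4th-order gammatone impulse response with the Glasberg and Moore ERB bandwidth, the 2 kHz carrier, the 500 Hz peak deviation, and the use of spectral spread (standard deviation of the power spectrum around its centroid) as a stand-in for the paper's dispersion rate.

```python
import numpy as np

def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Unit-energy gammatone impulse response centred at fc Hz (assumed design)."""
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)   # Glasberg-Moore ERB in Hz
    b = 1.019 * erb                           # conventional bandwidth scaling
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir ** 2))

def frequency_modulate(x, fs, carrier_hz=2000.0, deviation_hz=500.0):
    """Use x (peak-normalised) to modulate the frequency argument of a cosine carrier."""
    peak = np.max(np.abs(x))
    m = x / peak if peak > 0 else x
    t = np.arange(len(x)) / fs
    # The running sum approximates the integral of the modulating signal.
    phase = 2 * np.pi * carrier_hz * t + 2 * np.pi * deviation_hz * np.cumsum(m) / fs
    return np.cos(phase)

def spectral_spread(y, fs):
    """Standard deviation (Hz) of the power spectrum around its centroid.
    Hypothetical stand-in for the paper's dispersion measure."""
    spec = np.abs(np.fft.rfft(y * np.hanning(len(y)))) ** 2
    freqs = np.fft.rfftfreq(len(y), 1.0 / fs)
    p = spec / np.sum(spec)
    centroid = np.sum(freqs * p)
    return np.sqrt(np.sum((freqs - centroid) ** 2 * p))

def subband_dispersion(x, fs, centre_freqs, carrier_hz=2000.0, deviation_hz=500.0):
    """One spread value per gammatone channel for a single analysis segment."""
    out = []
    for fc in centre_freqs:
        sub = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        fm = frequency_modulate(sub, fs, carrier_hz, deviation_hz)
        out.append(spectral_spread(fm, fs))
    return np.array(out)

if __name__ == "__main__":
    fs = 8000
    rng = np.random.default_rng(0)
    segment = rng.standard_normal(800)        # placeholder for a 100 ms speech segment
    print(subband_dispersion(segment, fs, [300.0, 600.0, 1200.0, 2400.0]))
```

In this sketch an unmodulated carrier yields a narrow spectral peak, while any non-zero sub-band signal spreads energy into sidebands around the carrier. Whether the spread separates overlapped from single-speaker segments as reported depends on the actual features and parameters, which would have to be taken from the full paper.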

doi: 10.21437/Interspeech.2014-517

Cite as: Shokouhi, N., Sadjadi, S.O., Hansen, J.H.L. (2014) Co-channel speech detection via spectral analysis of frequency modulated sub-bands. Proc. Interspeech 2014, 2380-2384, doi: 10.21437/Interspeech.2014-517

  author={Navid Shokouhi and Seyed Omid Sadjadi and John H. L. Hansen},
  title={{Co-channel speech detection via spectral analysis of frequency modulated sub-bands}},
  booktitle={Proc. Interspeech 2014},