ISCA Archive Interspeech 2014

Speech detection in transient noises

G. Aneeja, B. Yegnanarayana

Voice activity detection (VAD) typically uses a representation of speech derived from spectrum analysis, followed by statistical characterization of the speech and the degrading noise. Features derived using traditional methods may not be adequate for VAD in the presence of transient noises. In this paper, we focus on transient noises, for which most of the VAD systems in the literature do not perform well. A representation with high temporal resolution and high frequency resolution is used to discriminate transient noises from speech. This representation is obtained by filtering the signal at several single frequencies. The single frequency filtering approach helps to isolate the regions of transient noise in a signal. A time-varying threshold, based on the spectral variance and the temporal variance of the speech signal, is proposed to detect transient noise. The remaining regions are processed using the spectral variance measure for VAD. The results are compared with those of the Adaptive Multi-Rate (AMR) methods. The performance of the proposed method is consistently better due to the instantaneous feature. The percentage of transient noise detected is higher for the proposed method than for the methods reported in the literature.
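The abstract does not give the filter equations, so the following is only a minimal sketch of one common way to realize single frequency filtering, not the authors' implementation. It assumes that each frequency of interest is shifted to half the sampling frequency and passed through a single-pole filter with its pole near the unit circle; the function names, the pole radius r = 0.99, and the choice of frequencies are all hypothetical. The spectral variance here is simply the variance of the envelopes across frequency at each sample, and the paper's time-varying threshold is not reproduced.

```python
import numpy as np

def sff_envelopes(x, fs, freqs, r=0.99):
    """Envelope of x at each frequency in freqs, one value per sample.

    Assumed scheme (not stated in the abstract): shift the frequency of
    interest to fs/2, then apply y[n] = x_k[n] - r * y[n-1].
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    env = np.empty((len(freqs), len(x)))
    for k, f in enumerate(freqs):
        # Shift frequency f to the half-sampling frequency (pi rad/sample).
        shift = np.pi - 2.0 * np.pi * f / fs
        xk = x * np.exp(1j * shift * n)
        # Single-pole filter with its pole at z = -r.
        y = np.empty(len(x), dtype=complex)
        prev = 0j
        for i, v in enumerate(xk):
            prev = v - r * prev
            y[i] = prev
        env[k] = np.abs(y)
    return env

def spectral_variance(env):
    """Variance of the envelopes across frequency, at each time instant."""
    return env.var(axis=0)

if __name__ == "__main__":
    fs = 8000
    freqs = [500.0, 1000.0, 2000.0]   # hypothetical analysis frequencies
    t = np.arange(1600) / fs

    tone = np.sin(2 * np.pi * 1000.0 * t)   # narrowband, speech-like component
    click = np.zeros(1600)
    click[200] = 1.0                        # idealized transient (impulse)

    sv_tone = spectral_variance(sff_envelopes(tone, fs, freqs))
    sv_click = spectral_variance(sff_envelopes(click, fs, freqs))
    print("tone  spectral variance (last sample):", sv_tone[-1])
    print("click spectral variance (maximum):    ", sv_click.max())
```

In this toy case the impulse excites every single-frequency filter equally, so its envelopes are identical across frequency and the spectral variance stays near zero, while the tone concentrates energy at one frequency and gives a large spectral variance. Real transient noises and speech are far less clean than this, so the contrast should be read as an illustration of the idea only.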

doi: 10.21437/Interspeech.2014-512

Cite as: Aneeja, G., Yegnanarayana, B. (2014) Speech detection in transient noises. Proc. Interspeech 2014, 2356-2360, doi: 10.21437/Interspeech.2014-512

  author={G. Aneeja and B. Yegnanarayana},
  title={{Speech detection in transient noises}},
  booktitle={Proc. Interspeech 2014},