ISCA Archive Interspeech 2007

Environmentally aware voice activity detector

Abhijeet Sangwan, Nitish Krishnamurthy, John H. L. Hansen

Traditional voice activity detectors (VADs) tend to be deaf to the acoustical background noise, as they (i) use a single operating point for all SNRs (signal-to-noise ratios) and noise types, and (ii) attempt to learn the background noise model online from a finite amount of data. In this paper, we address these issues by designing an environmentally aware (EA) VAD. The EA VAD scheme builds prior offline knowledge of commonly encountered acoustical backgrounds, and combines the recently proposed competitive Neyman-Pearson (CNP) VAD with an SVM (support vector machine) based noise classifier. In operation, the EA VAD obtains an accurate noise model of the acoustical background by employing the noise classifier and its prior knowledge of the noise type, and then uses this information to set the best operating point and initialization parameters for the CNP VAD. The superior performance of the EA VAD scheme over the standard AMR (adaptive multi-rate) VADs at low SNR is confirmed in a simulation study, where speech and noise data were drawn from the SWITCHBOARD and NOISEX databases. We report an absolute improvement of 10-15% in detection rates over the AMR VADs at low SNR for different noise types.
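The control flow described in the abstract (classify the background noise, look up prior knowledge for that noise type, then configure the detector) can be illustrated with a minimal sketch. This is not the authors' implementation: a nearest-centroid rule stands in for the SVM noise classifier, a simple frame-energy threshold stands in for the CNP VAD, and every name and number below (`NoiseProfile`, `PROFILES`, the two-dimensional features, the dB values) is a hypothetical placeholder chosen only for illustration.

```python
from dataclasses import dataclass
from math import log10


@dataclass(frozen=True)
class NoiseProfile:
    """Offline prior knowledge for one noise type (all values hypothetical)."""
    centroid: tuple          # mean feature vector observed offline
    noise_energy_db: float   # initial noise-floor estimate for the detector
    threshold_db: float      # operating point: required margin above the floor


# Hypothetical prior table; a real system would build this offline from data.
PROFILES = {
    "white": NoiseProfile(centroid=(0.50, 0.10), noise_energy_db=-30.0, threshold_db=6.0),
    "babble": NoiseProfile(centroid=(0.15, 0.60), noise_energy_db=-25.0, threshold_db=9.0),
}


def classify_noise(features):
    """Nearest-centroid stand-in for the SVM noise classifier."""
    def sq_dist(name):
        centroid = PROFILES[name].centroid
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(PROFILES, key=sq_dist)


def frame_energy_db(frame):
    """Mean power of a frame in dB, floored to avoid taking the log of zero."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * log10(max(power, 1e-12))


def detect_speech(frames, noise_features):
    """Select a profile from the noise class, then threshold each frame.

    The energy threshold is a placeholder for the CNP VAD decision rule.
    """
    profile = PROFILES[classify_noise(noise_features)]
    cutoff_db = profile.noise_energy_db + profile.threshold_db
    return [frame_energy_db(frame) > cutoff_db for frame in frames]


if __name__ == "__main__":
    quiet = [0.01] * 160   # about -40 dB
    loud = [0.5] * 160     # about -6 dB
    print(classify_noise((0.45, 0.12)))
    print(detect_speech([quiet, loud], (0.45, 0.12)))
```

The point of the sketch is the separation of concerns: the classifier only selects a row of the prior table, and the detector only consumes the operating point and initial noise estimate stored in that row, so either component could be swapped for the SVM or the CNP VAD without changing the other.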