ISCA Archive Interspeech 2014

The use of low-frequency ultrasound for voice activity detection

Ian Vince McLoughlin

An active detection system is developed which uses low-power, low-frequency ultrasonic reflection to determine the lip state (i.e. whether open, closed or in between) of a human speaker, and hence the presence of vocal activity. In operation, a small loudspeaker or sounder, located within a few centimetres of the lips, emits an excitation signal towards them. A co-located microphone receives the signal reflected from the lip region. Even simple analysis of the reflected signal reveals whether the mouth is open or closed. Given an excitation located above the normal frequency range of human speech, the method is unaffected by speech energy. If the excitation frequency is moved above the normal threshold of human hearing (i.e. an ultrasonic excitation), the method is also inaudible. Careful placement of the excitation signal at the extreme low end of the ultrasonic range allows its generation and analysis to be performed with inexpensive off-the-shelf audio hardware. This paper describes the techniques used, presents experimental details regarding the signals, then implements and evaluates a simple voice activity detector based on the technique.
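The idea can be illustrated with a minimal sketch. It is not the detector evaluated in the paper. Everything numeric here is an assumption made for illustration: the 48 kHz sample rate, the 20 kHz excitation tone, the 10 ms frame, the reflection amplitudes and the 20% decision threshold. The helper names (`goertzel_power`, `synth_frame`, `is_mouth_open`) are hypothetical. The sketch measures the power of the microphone signal at the excitation frequency only, and flags a frame when that power departs from a closed-mouth reference; it makes no claim about which direction the reflection changes when the lips open.

```python
import math

SAMPLE_RATE = 48_000    # Hz; assumed rate of commodity audio hardware
EXCITATION_HZ = 20_000  # Hz; assumed tone at the low end of the ultrasonic range
FRAME_LEN = 480         # samples per 10 ms analysis frame (assumption)


def goertzel_power(frame, freq_hz, sample_rate):
    """Normalised power of `frame` at the DFT bin nearest `freq_hz` (Goertzel)."""
    n = len(frame)
    k = round(n * freq_hz / sample_rate)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2
    return power / (n * n)  # equals A^2 / 4 for a bin-centred tone of amplitude A


def synth_frame(reflect_amp, speech_amp=0.0):
    """Synthetic microphone frame: reflected tone plus a 200 Hz 'speech' tone.

    Both amplitudes are made-up values, not measurements from the paper.
    """
    return [
        reflect_amp * math.sin(2.0 * math.pi * EXCITATION_HZ * i / SAMPLE_RATE)
        + speech_amp * math.sin(2.0 * math.pi * 200.0 * i / SAMPLE_RATE)
        for i in range(FRAME_LEN)
    ]


def is_mouth_open(frame, closed_ref_power, rel_threshold=0.2):
    """Flag a frame whose excitation-band power departs from the closed-mouth
    reference by more than `rel_threshold` (an assumed, untuned value)."""
    power = goertzel_power(frame, EXCITATION_HZ, SAMPLE_RATE)
    return abs(power - closed_ref_power) / closed_ref_power > rel_threshold


# Calibrate on a frame recorded (here: synthesised) with the mouth closed.
closed_ref = goertzel_power(synth_frame(0.10), EXCITATION_HZ, SAMPLE_RATE)

# Loud in-band "speech" alone does not disturb the ultrasonic measurement.
closed_with_speech = is_mouth_open(synth_frame(0.10, speech_amp=0.5), closed_ref)

# A changed reflection amplitude (assumed to accompany open lips) is flagged.
open_with_speech = is_mouth_open(synth_frame(0.06, speech_amp=0.5), closed_ref)
```

Measuring a single narrow band is what makes the approach insensitive to speech energy in this sketch: the 200 Hz component falls in a different DFT bin and contributes essentially nothing at 20 kHz. A real system would also have to contend with head movement, ambient ultrasonic noise and the intermediate lip states mentioned above, none of which are modelled here.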