ISCA Archive Interspeech 2008

Evaluation of voice activity and voicing detection

Bojan Kotnik, Pierre Sendorek, Sergey Astrov, Turgay Koc, Tolga Ciloglu, Laura Docío Fernández, Eduardo Rodríguez Banga, Harald Höge, Zdravko Kačič

This paper describes the ECESS evaluation campaign of voice activity and voicing detection. Standard VAD classifies a signal into speech and non-speech; we extend it to VAD+, which classifies a signal as a sequence of non-speech, voiced, and unvoiced segments. The evaluation is performed on a portion of the Spanish SPEECON database with manually labeled segmentation. To avoid errors caused by the limited precision of manual labeling, we introduce "dead zones": tolerance intervals of ±5 ms around label changes in the data set. The signal is not evaluated within these tolerance intervals.
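
The dead-zone idea can be illustrated with a minimal Python sketch. It is not the campaign's scoring tool: the segment format `(start_ms, end_ms, label)`, the 1 ms evaluation step, the label names "N", "V", "U", and the function names `label_at`, `in_dead_zone`, and `count_errors` are all assumptions made for illustration. The only detail taken from the abstract is the ±5 ms tolerance around reference label changes.

```python
from typing import List, Optional, Tuple

# Hypothetical segment format: (start_ms, end_ms, label), end exclusive.
# Assumed labels: "N" = non-speech, "V" = voiced, "U" = unvoiced.
Segment = Tuple[int, int, str]


def label_at(segments: List[Segment], t_ms: int) -> Optional[str]:
    """Return the label covering time t_ms, or None if no segment covers it."""
    for start, end, label in segments:
        if start <= t_ms < end:
            return label
    return None


def in_dead_zone(reference: List[Segment], t_ms: int, tol_ms: int = 5) -> bool:
    """True if t_ms lies within tol_ms of a label change in the reference."""
    for (_, end, left), (_, _, right) in zip(reference, reference[1:]):
        if left != right and abs(t_ms - end) <= tol_ms:
            return True
    return False


def count_errors(
    reference: List[Segment],
    hypothesis: List[Segment],
    tol_ms: int = 5,
) -> Tuple[int, int]:
    """Compare labels every 1 ms (an assumed step), skipping dead zones.

    Returns (mismatches, evaluated_points). A time point the hypothesis
    does not cover counts as a mismatch.
    """
    errors = 0
    evaluated = 0
    for t in range(reference[0][0], reference[-1][1]):
        if in_dead_zone(reference, t, tol_ms):
            continue  # tolerance interval: not evaluated
        evaluated += 1
        if label_at(hypothesis, t) != label_at(reference, t):
            errors += 1
    return errors, evaluated


reference = [(0, 100, "N"), (100, 200, "V"), (200, 300, "U")]
# Detector output whose first boundary is 3 ms late.
hypothesis = [(0, 103, "N"), (103, 200, "V"), (200, 300, "U")]
errors, evaluated = count_errors(reference, hypothesis)
```

In this example the 3 ms boundary shift falls entirely inside a dead zone, so it contributes no mismatches. A shift larger than the tolerance would be penalized only for the part that lies outside the ±5 ms interval.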