Analysis of large spontaneous speech corpora reveals that creaky mode appears more frequently than expected, especially among young female speakers. Creaky mode usually causes fundamental frequency (F0) measurement errors, so creaky voice segments must often be identified manually beforehand to avoid erroneous F0 readings in large speech databases. Various approaches have been proposed to automatically identify creaky segments exhibiting diplophonia or vocal fry, based on autocorrelation, the average magnitude difference function (AMDF), hidden Markov models (HMM), pitch markers, etc. The approach proposed here is based on narrow-band Fourier spectrum analysis and operates not on a single frame but on the evaluation of sudden changes in the harmonic distribution across consecutive frames. The implemented algorithm simulates the visual detection of creak on a spectrographic display, where so-called subharmonics appear on short voice segments.
Index Terms: creaky voice, diplophonia, vocal fry, fundamental frequency, spontaneous speech.
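The frame-to-frame idea summarized in the abstract can be illustrated with a minimal sketch. This is not the implemented algorithm, only one possible reading of it under stated assumptions: a rough modal F0 estimate (`f0_hz`) is assumed to be available already, narrow-band spectra are taken from long (64 ms) Hann-windowed frames, and a frame is flagged when the energy midway between harmonics rises sharply relative to a few frames earlier. All function names, the window and hop lengths, and the `jump` and `lag` thresholds are hypothetical choices made for illustration.

```python
import numpy as np

def narrowband_spectra(x, sr, win_s=0.064, hop_s=0.010):
    """Magnitude spectra of long Hann-windowed frames (narrow-band analysis)."""
    n = int(round(win_s * sr))
    hop = int(round(hop_s * sr))
    window = np.hanning(n)
    starts = range(0, len(x) - n + 1, hop)
    frames = np.array([x[s:s + n] * window for s in starts])
    # Return the spectra and the width of one FFT bin in Hz.
    return np.abs(np.fft.rfft(frames, axis=1)), sr / n

def subharmonic_ratio(spec, bin_hz, f0_hz, n_harm=5):
    """Energy midway between harmonics of f0_hz, relative to the harmonics."""
    k = np.arange(1, n_harm + 1)
    harm_bins = np.round(k * f0_hz / bin_hz).astype(int)
    sub_bins = np.round((k - 0.5) * f0_hz / bin_hz).astype(int)
    return spec[sub_bins].sum() / (spec[harm_bins].sum() + 1e-12)

def flag_subharmonic_onsets(x, sr, f0_hz, jump=0.3, lag=7):
    """Flag frames whose subharmonic ratio rose by more than `jump`
    compared with the frame `lag` hops earlier (assumed thresholds)."""
    spectra, bin_hz = narrowband_spectra(x, sr)
    ratios = np.array([subharmonic_ratio(s, bin_hz, f0_hz) for s in spectra])
    rise = ratios[lag:] - ratios[:-lag]
    flagged = np.flatnonzero(rise > jump) + lag
    return ratios, flagged
```

On a synthetic signal whose harmonics of 200 Hz are joined halfway through by components at 100, 300, 500 Hz and so on (a crude stand-in for period doubling), the ratio stays near zero in the first half and rises in the second, and only frames around the transition are flagged. Real speech would additionally need F0 tracking and voicing decisions, which this sketch leaves out.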