Iterative adaptive inverse filtering (IAIF) [1] remains among the state-of-the-art
algorithms for estimating glottal flow from the recorded speech signal.
Here, we re-examine IAIF in light of its foundational, classical model
of voiced (non-nasalized) speech, wherein the overall spectral tilt
is caused only by lip-radiation and glottal effects, while the vocal-tract
transfer function contains formant peaks but is otherwise not tilted.
In contrast, IAIF initially models and cancels the formants after only
a first-order preemphasis of the speech signal, which is generally
not enough to completely remove spectral tilt. Iterative optimal
preemphasis (IOP) is therefore proposed to replace IAIF’s initial
step. IOP is a rapidly converging algorithm that models the signal with
one real pole at a time, inverse-filtering it with the corresponding real
zero after each fit, until the spectral tilt is flattened. IOP-IAIF is
evaluated on sustained /a/ in a range
of voice qualities from weak-breathy to shouted-tense. Compared with
standard IAIF, IOP-IAIF yields: (i) an acceptable glottal flow even
for a weak-breathy voice that the standard algorithm failed to handle;
(ii) generally smoother glottal flows that nevertheless retain pulse
shape and closed phase; and (iii) enhanced separation of voice qualities
in both normalized amplitude quotient (NAQ) and glottal harmonic spectra.
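
The IOP loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each real pole is fitted by first-order linear prediction (autocorrelation method), and that the tilt counts as flattened once the fitted coefficient falls below a tolerance. The function name `iop` and the parameters `tol` and `max_iter` are hypothetical.

```python
import numpy as np

def iop(x, tol=1e-3, max_iter=50):
    """Sketch of iterative optimal preemphasis (assumed formulation).

    Each pass fits a one-pole model 1 / (1 - a z^-1) to the current
    signal and removes it with the zero (1 - a z^-1). The loop stops
    once the fitted coefficient is negligible, which is taken here as
    a proxy for "spectral tilt is flattened".
    """
    y = np.asarray(x, dtype=float).copy()
    coeffs = []
    for _ in range(max_iter):
        r0 = np.dot(y, y)            # lag-0 autocorrelation (energy)
        if r0 == 0.0:
            break                    # silent input: nothing to model
        r1 = np.dot(y[1:], y[:-1])   # lag-1 autocorrelation
        a = r1 / r0                  # first-order predictor, |a| <= 1
        if abs(a) < tol:
            break                    # assumed stopping criterion
        # Inverse-filter with the single real zero: y[n] - a * y[n-1]
        y = np.concatenate(([y[0]], y[1:] - a * y[:-1]))
        coeffs.append(a)
    return y, coeffs
```

In this sketch, `coeffs` records the real pole removed at each pass, so the accumulated preemphasis filter is the product of the corresponding first-order zeros. The output `y` would then be passed to the formant-modeling stage of IAIF in place of the singly preemphasized signal.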