The transmission protocol of sustained voiced speech is hypothesized to be based on a fundamental drive process, which synchronizes the vocal tract excitation on the transmitter side and evokes the pitch perception on the receiver side. A band-limited fundamental drive is extracted from a voice-specific subband decomposition of a speech signal. When this nearly periodic drive serves as the fundamental drive of a two-level drive-response model, a more or less aperiodic voiced excitation can be reconstructed as a correspondingly aperiodic trajectory on a low-dimensional synchronization manifold described by speaker- and phoneme-specific coupling functions. In the case of vowels and nasals, the excitation depends on a single phase of the fundamental drive. In the case of other sustained voiced consonants, the excitation may include an additional coupling function, which depends on a delayed fundamental phase with a phoneme-specific time delay. The delay may exceed the length of the analysis window. The resulting long-range correlation cannot be analysed by methods that assume stationary excitation.
Index Terms. voiced speech, fundamental drive, two-level drive-response model, generalized synchronization, delayed excitation