ISCA Archive Eurospeech 1997

Extraction and representation of rhythmic components of spontaneous speech

Shigeyoshi Kitazawa, Hideya Ichikawa, Satoshi Kobayashi, Yukihiro Nishinuma

Speech rate is measured and displayed with our algorithm TEMAX (Temporal Evaluation and Measurement Algorithm by KS). The TEMAX-gram, a sonagraphic display of the speech envelope computed by DFT with a 1-second window, is convenient for bringing out isosyllabic characteristics. For Japanese, it traces two dark bars, called rhythmic formants RF1 and RF2: the first around 8 Hz and the second at about half that frequency. RF1 corresponds to the speech rate, and RF2 represents the bimoraic rhythmic foot. For English, the isochronic characteristics are observable as RF1 with a 2-second window. Furthermore, with a 1-second window, the periodicity of syllables between stresses is displayed as RF2.
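The abstract does not give the details of TEMAX, so the following is only a minimal sketch of the general idea it describes: take the amplitude envelope of the signal and compute a short-time DFT of that envelope with a 1-second window, so that syllable-rate periodicities appear as peaks at a few hertz. The function names, the rectify-and-average envelope, the Hann window, the 100 Hz envelope rate, the 0.1-second hop, and the synthetic 8 Hz test signal are all assumptions made for illustration, not the authors' implementation.

```python
import cmath
import math


def amplitude_envelope(signal, fs, env_rate=100):
    """Rectify the signal and average it in blocks (env_rate samples per second).

    Hypothetical envelope extractor; TEMAX may compute its envelope differently.
    """
    block = fs // env_rate
    return [
        sum(abs(x) for x in signal[i:i + block]) / block
        for i in range(0, len(signal) - block + 1, block)
    ]


def envelope_spectrogram(env, env_rate=100, win_s=1.0, hop_s=0.1, max_hz=20.0):
    """Short-time DFT magnitude of the envelope, kept up to max_hz.

    Returns (freqs, frames), where frames[t][k] is the magnitude at freqs[k].
    The frequency resolution is 1 / win_s, i.e. 1 Hz for a 1-second window.
    """
    n = int(round(win_s * env_rate))           # samples per analysis window
    hop = max(1, int(round(hop_s * env_rate)))
    n_bins = int(max_hz * win_s) + 1
    freqs = [k / win_s for k in range(n_bins)]
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    frames = []
    for start in range(0, len(env) - n + 1, hop):
        seg = env[start:start + n]
        mean = sum(seg) / n
        # Remove the mean so the DC level does not mask the rhythm peaks.
        seg = [(v - mean) * w for v, w in zip(seg, hann)]
        frames.append([
            abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(seg)))
            for k in range(n_bins)
        ])
    return freqs, frames


def dominant_frequency(freqs, frame):
    """Frequency of the strongest non-DC bin in one frame."""
    k = max(range(1, len(frame)), key=frame.__getitem__)
    return freqs[k]


# Synthetic stand-in for speech: a 200 Hz carrier modulated at 8 Hz.
fs = 2000
signal = [
    (1 + math.cos(2 * math.pi * 8 * i / fs)) * math.sin(2 * math.pi * 200 * i / fs)
    for i in range(3 * fs)
]
env = amplitude_envelope(signal, fs)
freqs, frames = envelope_spectrogram(env)
peaks = [dominant_frequency(freqs, frame) for frame in frames]
print(peaks[0])
```

Plotting `frames` over time as a grey-scale image would give a display of the kind the abstract calls a TEMAX-gram, with a steady modulation showing up as a dark horizontal bar. Passing `win_s=2.0` halves the bin spacing to 0.5 Hz, which corresponds to the longer window the abstract uses for English.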