ISCA Archive Eurospeech 1993

Waveform similarity based overlap-add (WSOLA) for time-scale modification of speech: structures and evaluation

Marc Roelands, Werner Verhelst

A synchronization criterion for overlap-add time-scale modification is derived through a least-squares estimation of the modified short-time Fourier transform. Based on this criterion, a structural time-domain framework for time-scale modification is described. One efficient variant, the Waveform Similarity based Overlap-Add (WSOLA) method, produces high-quality output when applied to speech and can also be applied successfully to a broader class of signals, including multiple simultaneous voices and musical instruments. Fine-tuning the synchronization criterion can make the computational cost very low without affecting the high quality obtained, which opens up versatile possibilities for on-line operation.
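
The general idea of waveform-similarity-based overlap-add can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, window, or search range, so the code below assumes an unnormalized cross-correlation as the similarity measure, a periodic Hann window with 50% overlap, and a symmetric search interval. The names `wsola`, `alpha`, `frame_len`, and `tolerance` are hypothetical, `frame_len` is assumed to be even, and `alpha` is assumed to be positive. Only the standard library is used, so the sketch favors readability over speed.

```python
import math

def wsola(x, alpha, frame_len=256, tolerance=64):
    """Stretch the sample list x in time by factor alpha (alpha > 1 lengthens it)."""
    hop = frame_len // 2  # synthesis hop: 50% overlap (assumed, not from the paper)
    # Periodic Hann window: two copies shifted by `hop` sum to exactly 1.
    win = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_len)
           for n in range(frame_len)]
    n_frames = max(1, int((len(x) * alpha - frame_len) // hop) + 1)
    # Zero-pad both ends so every read below stays in bounds.
    pad = tolerance + frame_len + hop
    xp = [0.0] * pad + [float(v) for v in x] + [0.0] * pad
    y = [0.0] * ((n_frames - 1) * hop + frame_len)

    prev = 0  # start (in x coordinates) of the segment used for the previous frame
    for k in range(n_frames):
        # Where plain, unsynchronized overlap-add would read the input.
        nominal = int(round(k * hop / alpha))
        if k == 0:
            pos = nominal
        else:
            # Template: the segment that naturally follows the previous one in x.
            t0 = pad + prev + hop
            template = xp[t0:t0 + frame_len]
            # Search around the nominal position for the most similar segment
            # (unnormalized cross-correlation, an assumed similarity measure).
            best, pos = -math.inf, nominal
            for d in range(-tolerance, tolerance + 1):
                s0 = pad + nominal + d
                score = sum(a * b for a, b in zip(xp[s0:s0 + frame_len], template))
                if score > best:
                    best, pos = score, nominal + d
        # Overlap-add the windowed segment at the regular synthesis position.
        seg = xp[pad + pos:pad + pos + frame_len]
        for n in range(frame_len):
            y[k * hop + n] += win[n] * seg[n]
        prev = pos
    return y
```

Calling `wsola(samples, 2.0)` returns a list roughly twice as long as `samples`, and `wsola(samples, 0.5)` one roughly half as long. The first and last `frame_len // 2` output samples are attenuated by the window because only one frame covers them. The exhaustive search over `2 * tolerance + 1` candidate offsets dominates the cost in this sketch, which is the part a cheaper synchronization criterion would target.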