This paper presents a new approach to synthesizing fast speech in unit selection synthesis. After recording two inventories - one at normal and one at fast speech rate articulated as accurately as possible - speech was synthesized from both corpora independently. Since fast speech differs from normal rate speech in terms of acoustic characteristics, the concept of multi-phone (phoxsy) units [1] was implemented and used to synthesize speech at both speaking rates again. A perceptual evaluation showed that phoxsy units enhanced the intelligibility for fast speech synthesis significantly.
Index Terms: fast speech, unit selection, phoxsy units
[1] Breuer, S., Abresch, J., "Phoxsy: Multi-phone segments for unit selection speech synthesis," in Interspeech-2004 (ICSLP).