To find the best way to concatenate synthesis units, this paper focuses on the auditory detectability of discontinuities. These may appear as jumps in energy, f0, and spectral characteristics caused by concatenation. After introducing the main causes and effects of different discontinuities, the paper presents a new test method for investigating the detectability of concatenation phenomena. The stimuli for this test are synthetic sentences produced under controlled conditions. From the test results, conclusions are drawn about concatenation techniques and about the test method itself.
Keywords: text-to-speech; synthesis-by-concatenation; concatenation quality evaluation; speech perception