ISCA Archive Eurospeech 1993

Controlling search in segmentation lattices of speech signals

Kai Hübener, Andreas Hauenstein

Multi-level segmentation of speech signals has been growing in popularity for some time. It yields a rich representation that captures both coarse and fine acoustic information in a uniform framework. However, determining which segments are useful, and how they should be combined to arrive at the correct segmentation of the acoustic signal, has proved rather difficult. The approach described in this paper uses segment classification confidences together with dynamically generated segment duration constraints for disambiguation. An experimental upper bound on the performance achievable with duration constraints is determined. Experiments show that the results compare well with a manual phonetic transcription.
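
To illustrate the general idea of disambiguating a segmentation lattice with confidences and duration constraints, the following is a minimal sketch and not the authors' algorithm. It assumes each lattice segment is a tuple (start, end, label, confidence), that duration limits are given as a fixed per-label table (the paper generates them dynamically), and that a path is scored by duration-weighted log confidence. The function name `best_segmentation` and all parameter names are hypothetical.

```python
import math
from collections import defaultdict


def best_segmentation(segments, total_frames, duration_limits):
    """Pick the best gap-free path through a segment lattice.

    segments: iterable of (start, end, label, confidence), 0 < confidence <= 1.
    duration_limits: label -> (min_frames, max_frames); labels without an
        entry are unconstrained (an assumption of this sketch).
    Returns (score, path), or None if no path covers frames 0..total_frames.
    """
    # Index admissible segments by start frame, dropping those that
    # violate the duration limits for their label.
    by_start = defaultdict(list)
    for start, end, label, conf in segments:
        lo, hi = duration_limits.get(label, (1, total_frames))
        if end > start and lo <= end - start <= hi:
            by_start[start].append((end, label, conf))

    # best[t] holds the best score and path of a segmentation of frames 0..t.
    best = {0: (0.0, [])}
    for t in range(total_frames):
        if t not in best:
            continue  # no segmentation ends exactly at frame t
        score, path = best[t]
        for end, label, conf in by_start.get(t, []):
            # Assumed scoring: log confidence weighted by segment duration.
            cand = score + (end - t) * math.log(conf)
            if end not in best or cand > best[end][0]:
                best[end] = (cand, path + [(t, end, label)])
    return best.get(total_frames)
```

In this sketch, a long segment with high confidence can be rejected purely because its duration is implausible for its label, which lets a sequence of shorter segments win. Because every segment has positive length, frames can be processed in increasing order and each `best[t]` is final by the time it is expanded.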