ISCA Archive Eurospeech 1997

Signal driven generation of word baseforms from few examples

Andreas Hauenstein

The work described in this paper attempts to automatically generate word baseforms of the kind used in the pronunciation dictionaries of large-vocabulary speech recognition systems. The input to the algorithm consists of several sample utterances per word. No additional information, such as word spelling, is used. The task involves determining a suitable inventory of subword units (SWUs) as well as the baseforms themselves. Experiments show that improvements over a triphone-based dictionary are possible with fewer than ten sample utterances per word if the test and training vocabularies are different. A possible application would be a system based on a fixed inventory of HMMs that needs to be adapted to different vocabularies.


doi: 10.21437/Eurospeech.1997-360

Cite as: Hauenstein, A. (1997) Signal driven generation of word baseforms from few examples. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 1031-1034, doi: 10.21437/Eurospeech.1997-360

@inproceedings{hauenstein97b_eurospeech,
  author={Andreas Hauenstein},
  title={{Signal driven generation of word baseforms from few examples}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={1031--1034},
  doi={10.21437/Eurospeech.1997-360},
  issn={1018-4074}
}