ISCA Archive Eurospeech 1997

Semi-automatic phonetic labelling of large corpora

O. Mella, D. Fohr

This paper presents a methodology for semi-automatically labelling large corpora. The methodology rests on three main points: using several concurrent automatic stochastic labellers, decomposing the labelling of the whole corpus into an iterative refinement process, and building a labelling comparison procedure that takes phonological and acoustic-phonetic rules into account to evaluate the similarity of the various labellings of one sentence. After detailing these three points, we describe our HMM-based labelling tool and the application of the methodology to the Swiss French POLYPHON database.
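
As a rough illustration of the idea of comparing the outputs of several concurrent labellers, the following Python sketch flags a sentence for manual checking when any two labellings disagree. It is not the authors' procedure: the abstract does not give the phonological and acoustic-phonetic rules, so this sketch substitutes a simple check (identical phone sequence, boundaries within a tolerance). The names `labellings_agree` and `needs_manual_check`, the `(phone, start, end)` tuple format, and the 20 ms tolerance are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical boundary tolerance in seconds; this value is an assumption,
# not a figure taken from the paper.
BOUNDARY_TOLERANCE = 0.020


def labellings_agree(a, b, tol=BOUNDARY_TOLERANCE):
    """Return True if two labellings of one sentence are considered similar.

    Each labelling is a list of (phone, start, end) tuples, times in seconds.
    This stand-in rule requires the same phone sequence and corresponding
    boundaries that differ by at most `tol`.
    """
    if [phone for phone, _, _ in a] != [phone for phone, _, _ in b]:
        return False
    return all(
        abs(start_a - start_b) <= tol and abs(end_a - end_b) <= tol
        for (_, start_a, end_a), (_, start_b, end_b) in zip(a, b)
    )


def needs_manual_check(labellings, tol=BOUNDARY_TOLERANCE):
    """Flag a sentence when at least one pair of labellers disagrees."""
    return any(
        not labellings_agree(a, b, tol)
        for a, b in combinations(labellings, 2)
    )
```

Under these assumptions, sentences on which all labellers agree could be accepted automatically, while flagged sentences would be passed to the next refinement iteration or to a human expert.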


doi: 10.21437/Eurospeech.1997-491

Cite as: Mella, O., Fohr, D. (1997) Semi-automatic phonetic labelling of large corpora. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 1731-1734, doi: 10.21437/Eurospeech.1997-491

@inproceedings{mella97_eurospeech,
  author={O. Mella and D. Fohr},
  title={{Semi-automatic phonetic labelling of large corpora}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={1731--1734},
  doi={10.21437/Eurospeech.1997-491},
  issn={1018-4074}
}