ISCA Archive SLaTE 2009

The goodness of pronunciation algorithm: a detailed performance study

Sandra Kanters, Catia Cucchiarini, Helmer Strik

An inventory was compiled of pronunciation errors frequently made by foreigners speaking Dutch. On the basis of this inventory, artificial errors were created in a native development corpus, and these errors were in turn used to optimize thresholds for the Goodness of Pronunciation (GOP) algorithm. In the current study the GOP algorithm is evaluated in three different ways: (1) using a native test corpus with artificial errors that reflect errors frequently made by non-natives, (2) within an actual application used by non-natives for practicing pronunciation, and (3) post hoc, using the recorded interactions of the pronunciation training application, to determine what the performance of the algorithm would have been if optimal speaker-specific and phone-specific thresholds had been used. The results show that the performance of the GOP algorithm was satisfactory and that determining thresholds by simulating realistic pronunciation errors was an appropriate procedure, because performance on the artificially introduced errors closely approximated performance on real data. This finding is particularly welcome given that paucity of data is a common problem in this kind of research. Furthermore, post-hoc threshold optimization led to only a slight increase in performance.
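
To make the threshold optimization step concrete, here is a minimal sketch of how a threshold for one phone could be chosen from GOP scores of correct realizations and artificially introduced errors. The abstract does not state the optimization criterion or the score convention, so everything here is an assumption: the function name `best_threshold` is hypothetical, the scores are invented for illustration, plain accuracy is used as the criterion, and a higher GOP score is taken to mean a poorer match (flip the comparisons if the opposite convention applies).

```python
def best_threshold(correct_scores, error_scores):
    """Return (threshold, accuracy) maximizing accuracy for one phone.

    Assumed convention: a realization is flagged as mispronounced
    when its GOP score is strictly greater than the threshold.
    """
    # Every observed score is a candidate threshold.
    candidates = sorted(set(correct_scores) | set(error_scores))
    total = len(correct_scores) + len(error_scores)
    best_t, best_acc = None, -1.0
    for t in candidates:
        # Correct realizations that are (rightly) accepted.
        accepted = sum(s <= t for s in correct_scores)
        # Erroneous realizations that are (rightly) flagged.
        flagged = sum(s > t for s in error_scores)
        acc = (accepted + flagged) / total
        if acc > best_acc:  # strict '>' keeps the lowest best threshold
            best_t, best_acc = t, acc
    return best_t, best_acc


# Invented scores for a single phone, for illustration only.
correct = [0.5, 1.0, 1.5, 2.0]   # native, correctly pronounced
errors = [1.8, 3.0, 4.0, 5.0]    # artificially introduced errors

threshold, accuracy = best_threshold(correct, errors)
```

Running the same search separately for each phone would give phone-specific thresholds, and restricting the input to one speaker's recordings would give the speaker-specific variant mentioned in evaluation (3).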

Index Terms: Goodness of Pronunciation (GOP), pronunciation error detection, Computer Assisted Pronunciation Training (CAPT)

doi: 10.21437/SLaTE.2009-13

Cite as: Kanters, S., Cucchiarini, C., Strik, H. (2009) The goodness of pronunciation algorithm: a detailed performance study. Proc. Speech and Language Technology in Education (SLaTE 2009), 49-52, doi: 10.21437/SLaTE.2009-13

  author={Sandra Kanters and Catia Cucchiarini and Helmer Strik},
  title={{The goodness of pronunciation algorithm: a detailed performance study}},
  booktitle={Proc. Speech and Language Technology in Education (SLaTE 2009)},