ISCA Archive PMLA 2002

Modeling pronunciation variation for ASR: Comparing criteria for rule selection

J. M. Kessens, Helmer Strik, Catia Cucchiarini

In this paper we use a data-driven (DD) rule-based method for modeling pronunciation variation. Error analysis is performed in order to gain insight into the effect of pronunciation variation modeling. This analysis shows that although modeling pronunciation variation brings about improvements, deteriorations are also introduced. A strong correlation is found between the number of improvements and deteriorations per rule. This result indicates that it is not straightforward to improve the performance of automatic speech recognition (ASR) by excluding the rules that cause deteriorations, because these rules also produce a considerable number of improvements. Finally, we compare three different criteria for rule selection. This comparison indicates that the absolute frequency of rule application (Fabs) is the most suitable criterion for rule selection. For the best testing condition, a statistically significant reduction in Word Error Rate (WER) of 1.4% absolute, or 8.2% relative, is found.
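The two reported figures, 1.4% absolute and 8.2% relative, are linked by the baseline WER, which the abstract does not state. The short sketch below shows the arithmetic under the usual definition (relative reduction = absolute reduction divided by baseline WER). The function names are illustrative, and the baseline it produces is only implied by the two reported numbers, not a value taken from the paper.

```python
def relative_reduction_pct(baseline_wer, absolute_reduction):
    """Relative WER reduction in percent, given baseline WER and the
    absolute reduction (both in percentage points)."""
    return 100.0 * absolute_reduction / baseline_wer


def implied_baseline_wer(absolute_reduction, relative_reduction):
    """Baseline WER (in percent) implied by an absolute reduction
    (percentage points) and a relative reduction (percent)."""
    return 100.0 * absolute_reduction / relative_reduction


# Figures reported in the abstract.
ABSOLUTE = 1.4   # percentage points
RELATIVE = 8.2   # percent

# Derived here, not reported in the abstract: roughly 17% baseline WER.
baseline = implied_baseline_wer(ABSOLUTE, RELATIVE)
improved = baseline - ABSOLUTE

print(f"implied baseline WER: {baseline:.1f}%")
print(f"implied WER after modeling: {improved:.1f}%")
print(f"relative reduction check: {relative_reduction_pct(baseline, ABSOLUTE):.1f}%")
```

Because both reported figures are rounded, the implied baseline is approximate and should be read as an order-of-magnitude check rather than an exact result.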