ISCA Archive ICSLP 2002

Predicting oral reading miscues

Jack Mostow, Joseph Beck, S. Vanessa Winter, Shaojun Wang, Brian Tobin

This paper explores the problem of predicting specific reading mistakes, called miscues, on a given word. Characterizing likely miscues tells an automated reading tutor what to anticipate, detect, and remediate. As training and test data, we use a database of over 100,000 miscues transcribed by University of Colorado researchers. We explore approaches that exploit different sources of predictive power: the uneven distribution of words in text, and the fact that most miscues are real words. We compare the approaches' ability to predict miscues of other readers on other text. A simple rote method does best on the most frequent 100 words of English, while an extrapolative method for predicting real-word miscues performs well on less frequent words, including words not in the training data.
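
As an illustration of what a rote approach could look like, the following is a minimal sketch, not the implementation used in the paper. It assumes the training data can be represented as (target word, miscue) pairs; the function names `train_rote` and `predict_rote` and the toy data are hypothetical. The predictor simply returns the miscues most often observed for a target word, which is why it can only cover words that appear in the training data.

```python
from collections import Counter, defaultdict


def train_rote(pairs):
    """Count how often each miscue was observed for each target word."""
    table = defaultdict(Counter)
    for target, miscue in pairs:
        table[target.lower()][miscue.lower()] += 1
    return table


def predict_rote(table, target, k=3):
    """Return up to k of the most frequent miscues seen for the target.

    Returns an empty list for a word never seen in training, which is
    the gap an extrapolative method would need to fill.
    """
    counts = table.get(target.lower(), Counter())
    return [miscue for miscue, _ in counts.most_common(k)]


# Toy pairs invented for illustration only; not drawn from the paper's data.
pairs = [("the", "a"), ("the", "a"), ("the", "then"), ("was", "saw")]
table = train_rote(pairs)

print(predict_rote(table, "the", k=2))
print(predict_rote(table, "horse"))
```

A lookup table of this kind works best where observations are dense, which is consistent with the abstract's finding that the rote method does best on the most frequent words.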