ISCA Archive Interspeech 2024

A Dataset and Two-pass System for Reading Miscue Detection

Raj Gothi, Rahul Kumar, Mildred Pereira, Nagesh Nayak, Preeti Rao

Automatic speech recognition (ASR) has long been viewed as a promising solution to the resource-intensive task of oral reading fluency assessment. The demands on ASR accuracy, however, tend to be high, especially when the goal is to obtain reliable reading diagnostics. Prior knowledge of the reading prompt is typically used to limit the system's word error rate (WER). Even so, accurately detecting mispronounced words, which can be relatively few in number, while limiting false positives remains challenging. In this work, we present a new manually transcribed dataset of 1,110 elementary school children reading connected text in L2 English with wide-ranging proficiencies. Apart from local features derived from alternate decodings under different linguistic context constraints, we use an additional deep acoustic model. We discuss the performance gains achieved in a second pass over the initial hybrid ASR hypotheses.
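
To illustrate how a known reading prompt can be used to flag candidate miscues in a first-pass ASR hypothesis, the following is a minimal Python sketch. It is not the system described in this paper: the function name `label_miscues` and the label set are hypothetical, and the alignment uses the standard-library `difflib.SequenceMatcher`, which approximates but does not guarantee a minimum-edit-distance alignment. A second pass, as discussed above, would re-examine the flagged words with additional features.

```python
from difflib import SequenceMatcher


def label_miscues(prompt, hypothesis):
    """Align an ASR hypothesis to the reading prompt (illustrative only).

    Returns (labels, insertions):
      labels     -- list of (prompt_word, label) pairs, where label is
                    'correct', 'substituted', or 'omitted'
      insertions -- hypothesis words with no counterpart in the prompt
    """
    ref = prompt.lower().split()
    hyp = hypothesis.lower().split()

    labels = ["correct"] * len(ref)
    insertions = []

    # autojunk=False disables the popularity heuristic, which could
    # otherwise ignore frequent words in long passages.
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            # Every prompt word in the mismatched span is marked as
            # substituted, even if the two spans differ in length.
            for i in range(i1, i2):
                labels[i] = "substituted"
        elif tag == "delete":
            for i in range(i1, i2):
                labels[i] = "omitted"
        elif tag == "insert":
            insertions.extend(hyp[j1:j2])

    return list(zip(ref, labels)), insertions


if __name__ == "__main__":
    word_labels, extra = label_miscues(
        "the cat sat on the mat", "the cat sit on mat"
    )
    for word, label in word_labels:
        print(f"{word}\t{label}")
    print("inserted:", extra)
```

Words labelled `substituted` or `omitted` here are only candidates: a recognition error produces the same label as a genuine reading miscue, which is why limiting false positives requires further evidence beyond the alignment itself.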