ISCA Archive Interspeech 2025

SGED-Probe: Probing E2E ASR decoder and aligner for spoken grammar error detection under three speaking practice conditions

Chowdam Venkata Thirumala Kumar, Chiranjeevi Yarra

Grammar error detection in one's speech, referred to as spoken grammar error detection (SGED), is a critical component of computer-assisted language learning systems. Traditionally, SGED has been performed with ASR followed by text-based GED, but ASR errors can limit SGED performance. In this work, we analyse SGED under three speaking conditions (spontaneous, semi-spontaneous, and memorized) using five state-of-the-art ASRs. We examine the decoded text along with ASR cues such as confidence scores and aligner probabilities; the aligner probabilities are obtained using the ground-truth grammatically correct (GGC) text. Experiments reveal that autoregressive ASRs show a bias toward the GGC text, leading to suboptimal performance. Performance increases as the freedom of speech decreases, with semi-spontaneous speech outperforming memorized speech. Aligner probabilities outperform confidence scores despite alignment and overconfidence issues.