ISCA Archive Interspeech 2023

Automatic Deep Neural Network-Based Segmental Pronunciation Error Detection of L2 English Speech (L1 Bengali)

Puja Bharati, Sabyasachi Chandra, Shyamal Kumar Das Mandal

Over the last few decades, English has become a widely used language for global communication. Many learners of English nevertheless find it challenging to achieve 'acceptable' and 'intelligible' pronunciation. To address this, various computer-assisted pronunciation training (CAPT) tools have been designed, in which automatic pronunciation error detection (APED) is a core component. Most work on APED is based on European English speech, and no such work has been reported for Bengali English speech. This paper proposes a system for pronunciation error detection of L2 English speech (L1 Bengali) at the phoneme (segmental) level, using a hybrid of convolutional neural network (CNN) and long short-term memory (LSTM) modules trained with connectionist temporal classification (CTC) loss. Experiments are conducted on newly created speech data from L2 English speakers (L1 Bengali). The results demonstrate that the proposed system outperforms goodness of pronunciation (GOP)-based methods by 15% in terms of F1 score when using filter-bank (fbank) features.
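
To make the detection step concrete, the following is a minimal sketch of how the output of a CTC-trained phoneme recognizer could be turned into segmental error flags and scored with F1. It is not the authors' implementation: the phoneme labels, the example frames, the annotation, and the function names (ctc_greedy_collapse, flag_errors, f1_score) are all hypothetical, and the acoustic model itself (CNN + LSTM) is assumed to have already produced one best label per frame.

```python
from typing import List

BLANK = "<blank>"  # assumed name for the CTC blank symbol


def ctc_greedy_collapse(frame_labels: List[str]) -> List[str]:
    """Standard CTC collapse: merge repeated labels, then drop blanks."""
    out, prev = [], None
    for lab in frame_labels:
        if lab != prev and lab != BLANK:
            out.append(lab)
        prev = lab
    return out


def flag_errors(canonical: List[str], recognized: List[str]) -> List[bool]:
    """Align canonical and recognized phonemes by edit distance and mark
    each canonical phoneme as mispronounced (substituted or deleted)."""
    n, m = len(canonical), len(recognized)
    # dp[i][j] = edit distance between canonical[:i] and recognized[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if canonical[i - 1] == recognized[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + cost,  # match / substitution
                           dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1)          # insertion
    flags = [False] * n
    i, j = n, m
    while i > 0:  # backtrace through the alignment
        same = j > 0 and canonical[i - 1] == recognized[j - 1]
        if j > 0 and dp[i][j] == dp[i - 1][j - 1] + (0 if same else 1):
            flags[i - 1] = not same
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j] + 1:
            flags[i - 1] = True  # canonical phoneme was not realized
            i -= 1
        else:
            j -= 1  # inserted phoneme; not attributed to a canonical slot
    return flags


def f1_score(predicted: List[bool], actual: List[bool]) -> float:
    """F1 over per-phoneme error flags (True = mispronounced)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: canonical "think" with the first phoneme substituted.
canonical = ["th", "ih", "ng", "k"]
frames = ["t", "t", BLANK, "ih", "ih", BLANK, "ng", "k", "k"]
recognized = ctc_greedy_collapse(frames)
predicted_flags = flag_errors(canonical, recognized)
annotated_flags = [True, False, True, False]  # made-up human annotation
score = f1_score(predicted_flags, annotated_flags)
```

In this sketch a phoneme counts as an error whenever the recognized sequence disagrees with the canonical one. A GOP-based baseline would instead threshold a per-phoneme likelihood score, which is not shown here.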