ISCA Archive Clarity 2025

Domain-Adapted Automatic Speech Recognition with Deep Neural Networks for Enhanced Speech Intelligibility Prediction

Haeseung Jeon, Jiwoo Hong, Saeyeon Hong, Hosung Kang, Bona Kim, Se Eun Oh, Noori Kim
Previous studies have shown that adapting Automatic Speech Recognition (ASR) models for speech intelligibility prediction can outperform intrusive methods, yet many existing approaches still rely on pre-trained ASR models without domain-specific adaptation. In this work, we investigate the effect of fine-tuning ASR models on a domain-specific signal dataset to improve representation quality. Furthermore, we conduct a comparative evaluation of two prominent Deep Neural Network (DNN) architectures for audio modeling: Convolutional Neural Networks (CNNs) and Transformers. Notably, both models outperform the Hearing Aid Speech Perception Index (HASPI), with the Transformer-based model achieving higher performance, which we attribute to its ability to capture global contextual information.