ISCA Archive Clarity 2025

A Chorus of Whispers: Modeling Speech Intelligibility via Heterogeneous Whisper Decomposition

Longbin Jin, Donghun Min, Eun Yi Kim

This paper introduces a Chorus of Whispers, a simple yet effective method for modeling speech intelligibility in hearing-impaired listeners, developed for the third Clarity Prediction Challenge (CPC3). Our approach simulates a spectrum of perceptual abilities by creating a “chorus” of heterogeneous Whisper models, ranging from the powerful large version to the lightweight tiny variant. By decomposing the audio signal through the diverse outputs of this chorus, we extract robust representations that reflect listening difficulty. These representations are then fed into an ensemble of word- and sentence-level models to predict the final intelligibility score. The proposed method demonstrates strong generalization to unseen conditions, achieving a competitive RMSE of 23.62 on the CPC3 development set.
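The abstract does not say which representations are extracted from the chorus, so the following is only a minimal sketch of one possible reading, not the authors' feature set. It assumes that each Whisper size has already produced a transcript, scores each reference word by the fraction of models that recovered it, and averages those word scores into a sentence-level percentage. The helper names (`word_hits`, `chorus_features`, `rmse`) and the toy transcripts are hypothetical and invented for illustration; `rmse` shows how an error figure such as the one reported above is typically computed.

```python
import math


def word_hits(reference, hypothesis):
    """Return 1 or 0 per reference word: 1 if the word occurs in the hypothesis.

    Assumption: simple bag-of-words matching, ignoring word order and alignment.
    """
    hyp_words = set(hypothesis.lower().split())
    return [1 if w in hyp_words else 0 for w in reference.lower().split()]


def chorus_features(reference, transcripts):
    """Fraction of models that recovered each reference word.

    `transcripts` maps a (hypothetical) model label to its transcript.
    """
    per_model = [word_hits(reference, t) for t in transcripts.values()]
    n_models = len(per_model)
    return [sum(column) / n_models for column in zip(*per_model)]


def rmse(predicted, target):
    """Root-mean-square error between two equal-length score sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


# Toy, made-up transcripts standing in for outputs of different model sizes.
reference = "the cat sat on the mat"
transcripts = {
    "large": "the cat sat on the mat",
    "base": "the cat sat on a mat",
    "tiny": "the cat on mat",
}

word_scores = chorus_features(reference, transcripts)
sentence_score = 100 * sum(word_scores) / len(word_scores)
```

In this toy case only the smallest model drops a word, so that word receives a lower score than the others. In a real system the word-level and sentence-level scores would feed trained predictors rather than a plain average.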