ISCA Archive Interspeech 2025

Speech-Based Automatic Chronic Kidney Disease Diagnosis via Transformer Fusion of Glottal and Spectrogram Features

Jihyun Mun, Minhwa Chung, Sunhee Kim

Chronic kidney disease (CKD) is a global health concern characterized by a gradual and irreversible decline in kidney function. Early diagnosis and timely intervention are crucial, yet current methods rely primarily on invasive blood and urine tests. Since CKD affects the respiratory system and alters speech production, vocal characteristics may serve as biomarkers for disease detection. This study proposes a deep learning-based approach that integrates spectrogram and glottal features for CKD diagnosis. Spectrograms capture broad acoustic characteristics, whereas glottal features, known to be influenced by CKD, provide complementary phonatory information. To effectively fuse these features, we employ a transformer-like architecture. The proposed method achieves both an accuracy and a macro F1 score of 0.96, demonstrating its potential as an objective, non-invasive diagnostic tool. In addition, we analyze attention weights and gradient-based saliency maps to enhance model interpretability.
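As a rough illustration of what attention-based fusion of two feature streams can look like, the following minimal sketch lets spectrogram tokens attend over glottal tokens with single-head scaled dot-product attention. It is not the architecture used in this work: the direction of attention, the absence of learned projections, the function name `cross_attention`, and the toy token values are all assumptions made for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention (no learned projections).

    Returns the fused tokens and the attention weight matrix, so the
    weights can be inspected afterwards.
    """
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights.append(softmax(scores))
    fused = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))]
        for row in weights
    ]
    return fused, weights


# Hypothetical toy embeddings (assumed values, not data from the study):
# three spectrogram tokens and two glottal tokens, each of dimension 4.
spectrogram_tokens = [
    [0.2, 0.1, 0.4, 0.3],
    [0.5, 0.9, 0.1, 0.0],
    [0.3, 0.3, 0.3, 0.3],
]
glottal_tokens = [
    [0.1, 0.8, 0.2, 0.4],
    [0.7, 0.2, 0.6, 0.1],
]

# Spectrogram tokens act as queries; glottal tokens act as keys and values.
fused, attn = cross_attention(spectrogram_tokens, glottal_tokens, glottal_tokens)
```

Each row of `attn` is a probability distribution over the glottal tokens, which is the kind of quantity an attention-weight analysis would examine.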