ISCA Archive Interspeech 2023

Exploring Auditory Attention Decoding using Speaker Features

Zelin Qiu, Jianjun Gu, Dingding Yao, Junfeng Li

Auditory attention decoding (AAD) aims to identify the attended talker in a multi-talker scenario from neural recordings. Various AAD methods have been proposed in the past few years. Most of them rely on speech envelope reconstruction, which struggles when the decoding window is shortened. Inspired by findings that voices with different acoustic features evoke distinct brain activity within a very short period, this paper proposes using speaker voice features, instead of the speech envelope, as the speaker indicator for AAD with short decoding windows. To this end, a novel dual-branch convolutional network (DBCNet) is proposed to estimate speaker features from EEG. Results show that the proposed method achieves higher decoding accuracy than existing methods for short decoding windows (approximately 75% for a 0.3-s window and 82% for a 1.0-s window).
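
The abstract does not state how an EEG-estimated speaker feature is turned into a decision about the attended talker. As a minimal sketch of one plausible final step, the Python below assumes that a feature vector has already been estimated from one EEG window, that a reference feature vector is available for each candidate talker, and that the talker with the highest cosine similarity is selected. The function names, the vector dimensions, and the choice of cosine similarity are all illustrative assumptions, not details taken from the paper, and the DBCNet estimator itself is not modeled.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def decode_attended_speaker(
    estimated: Sequence[float],
    candidates: Sequence[Sequence[float]],
) -> int:
    """Return the index of the candidate talker whose reference feature
    is most similar to the feature estimated from one EEG window.

    Assumption: cosine similarity is the matching rule. The paper's
    actual decision rule may differ.
    """
    if not candidates:
        raise ValueError("at least one candidate talker is required")
    scores = [cosine_similarity(estimated, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical 3-dimensional features for two talkers and one EEG window.
talker_features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
eeg_estimate = [0.9, 0.1, 0.0]
attended = decode_attended_speaker(eeg_estimate, talker_features)
```

Under these assumptions, decoding accuracy for a given window length would be the fraction of EEG windows for which the returned index matches the talker the listener was actually attending to.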