ISCA Archive Interspeech 2025

Investigating the Reasoning Abilities of Large Language Models for Understanding Spoken Language in Interpersonal Interactions

Pranjal Aggarwal, Ghritachi Mahajani, Pavan Kumar Malasani, Vaibhav Jamadagni, Caroline J. Wendt, Ehsanul Haque Nirjhar, Theodora Chaspari

This study evaluates large language models’ (LLMs) reasoning capabilities in spoken language understanding (SLU) during interpersonal interactions. We incorporated several factors into LLM prompts: instructing via examples (IE), integrating domain knowledge (DK), and including context (IC). Experiments with Gemini-1.5-pro, GPT-3.5-turbo, and GPT-4o were conducted on an SLU task that classifies the degree of explanation (i.e., under-explained, succinct, comprehensive, over-explained) in job interview responses—an important step toward developing automatic interview training systems. Results demonstrate the feasibility of few-shot (1- to 4-shot) learning, with ablation studies confirming that modifications to prompts, especially when combining IE, DK, and IC, lead to further performance improvements.
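The three prompt factors can be illustrated with a minimal sketch. This is a hypothetical composition of an SLU classification prompt, not the paper's actual prompt wording: the function name, label phrasing, and example content are assumptions, and only the four-way label set and the IE/DK/IC factors come from the abstract.

```python
# Hypothetical sketch of composing an LLM prompt from the three factors
# described in the abstract: IE (instructing via examples), DK (integrating
# domain knowledge), and IC (including context). Wording is illustrative.

LABELS = ["under-explained", "succinct", "comprehensive", "over-explained"]

def build_prompt(response, examples=None, domain_knowledge=None, context=None):
    """Compose a classification prompt; each factor is optional, so
    ablations (IE only, DK only, IE+DK+IC, ...) just toggle arguments."""
    parts = [
        "Classify the degree of explanation in the interview response "
        f"as one of: {', '.join(LABELS)}."
    ]
    if domain_knowledge:  # DK: prepend domain knowledge
        parts.append(f"Domain knowledge: {domain_knowledge}")
    if context:           # IC: include interaction context
        parts.append(f"Context: {context}")
    if examples:          # IE: few-shot (1- to 4-shot) labeled examples
        for ex_response, ex_label in examples:
            parts.append(f"Response: {ex_response}\nLabel: {ex_label}")
    parts.append(f"Response: {response}\nLabel:")
    return "\n\n".join(parts)

prompt = build_prompt(
    "I used Python at my last job.",
    examples=[("I led a data team for five years, shipping three "
               "production ML systems.", "succinct")],
    domain_knowledge="Strong interview answers address the question "
                     "directly with relevant detail.",
    context="Question: Tell me about your technical background.",
)
print(prompt)
```

Sending the composed prompt to Gemini-1.5-pro, GPT-3.5-turbo, or GPT-4o and parsing the returned label would complete the classification step; the ablations in the paper correspond to enabling different subsets of these factors.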