ISCA Archive SpeechProsody 2024
ISCA Archive SpeechProsody 2024

Machine Learning Facilitated Investigations of Intonational Meaning: Prosodic Cues to Epistemic Shifts in American English Utterances

Nanette Veilleux, Stefanie Shattuck-Hufnagel, Sunwoo Jeong, Alejna Brugos, Byron Ahn

This work analyzes experimentally elicited speech to capture the relationship between prosody and semantic/pragmatic meanings. Production prompts were comicstrips where contexts were manipulated along axes prominently discussed in sem/prag literature. Participants were tasked with reading lines as the speaker would, uttering a target phrase communicating a proposition p (e.g., “only marble is available”) to a hearer who had epistemic authority on p. Prompts varied whether the speaker’s initial belief (prior bias) was confirmed (condition A: bias=p) or corrected (condition B: bias=¬p); this meaning difference was reinforced by response particles (A: “okay so” vs. B: “oh really”) preceding the target phrase. Over 475 productions were annotated with phonologically-informed phonetic labels (PoLaR). To model many-to-many mappings between features (prosodic form) and classification (sem/prag meaning), Random Forests were designed on labels and derived measures (including f0 ranges, slopes, TCoG) from 299 recordings — classifying meaning with high accuracy (>85%). RFs identified condition-distinguishing prosodic cues in both response particle and target phrases, leading to questions of how/whether functionally-overlapping lexical content might affect prosodic realization. Moreover, RFs identified phrase-final f0 as important, leading to deeper edge-tone explorations. These highlight how explanatory ML models can help iteratively improve targeted analysis.