We introduce the SRI CLEO (Conversational Language about Everyday Objects)
Speaker-State Corpus of speech, video, and biosignals. The goal of
the corpus is to provide insight into the speech and physiological changes
resulting from subtle, context-based influences on affect and cognition.
Speakers were prompted by collections of pictures of neutral everyday
objects and were instructed to speak about any subset
of the objects for a preset period of time (120 or 180 seconds depending
on task).
The corpus provides signals for 43 speakers under four different
speaker-state conditions: (1) neutral and emotionally charged audiovisual
background; (2) cognitive load; (3) time pressure; and (4) various
acted emotions. Unlike previous studies that have linked speaker state
to the content of the speaking task itself, the CLEO prompts remain
largely pragmatically, semantically, and affectively neutral across
all conditions. This framework enables more direct comparisons
across both conditions and speakers. The corpus also includes more
traditional speaker tasks involving reading and free-form reporting
of neutral and emotionally charged content. The explored biosignals
include skin conductance, respiration, blood pressure, and ECG. The
corpus is in the final stages of processing and will be made available
to the research community.