ISCA Archive Interspeech 2022

Challenges in Metadata Creation for Massive Naturalistic Team-Based Audio Data

Chelzy Belitz, John H.L. Hansen

A broad range of research fields benefit from the information extracted from naturalistic audio data. Speech research typically relies on human-generated metadata tags that serve as a set of “ground truth” labels for the development of speech processing algorithms. While the manual generation of metadata tags may be feasible on a small scale, unique problems arise when creating speech resources for massive, naturalistic audio data. This paper presents a general discussion of these challenges and offers suggestions for creating metadata for speech resources that are intended to be useful both in speech research and in numerous other fields, such as psychology, history, and audio archiving/preservation. Further, it provides an overview of how the task of creating a speech resource for various communities has been, and continues to be, approached for the massive corpus of audio from the historic NASA Apollo missions, which includes tens of thousands of hours of naturalistic, team-based audio data featuring numerous speakers across multiple points in history.