ISCA Archive Interspeech 2023

An Outlier Analysis of Vowel Formants from a Corpus Phonetics Pipeline

Emily P. Ahn, Gina-Anne Levow, Richard A. Wright, Eleanor Chodroff

With the growing availability of large-scale spoken databases, linguists are increasingly relying on automated tools to obtain time alignments of sound units to the speech signal. A typical automated pipeline may involve grapheme-to-phoneme conversion, forced alignment, and acoustic-phonetic measurement, and each of these stages requires a strong assumption regarding the output quality. We investigate these assumptions by auditing outliers in vowel formants from two multilingual read speech corpora, CMU Wilderness and Mozilla Common Voice, across three languages: Hausa, Kazakh, and Swedish. From this audit, we develop a novel outlier taxonomy that includes the broad outlier categories of transcript errors, alignment errors, formant tracking errors, linguistic variations, and fine samples. We show the utility of this outlier analysis in identifying weaknesses in corpus-specific and corpus-general pipeline assumptions, and discovering characteristics of particular languages.


doi: 10.21437/Interspeech.2023-1052

Cite as: Ahn, E.P., Levow, G.-A., Wright, R.A., Chodroff, E. (2023) An Outlier Analysis of Vowel Formants from a Corpus Phonetics Pipeline. Proc. INTERSPEECH 2023, 2573-2577, doi: 10.21437/Interspeech.2023-1052

@inproceedings{ahn23_interspeech,
  author={Emily P. Ahn and Gina-Anne Levow and Richard A. Wright and Eleanor Chodroff},
  title={{An Outlier Analysis of Vowel Formants from a Corpus Phonetics Pipeline}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
  pages={2573--2577},
  doi={10.21437/Interspeech.2023-1052}
}