ISCA Archive Interspeech 2016

Today’s Most Frequently Used F0 Estimation Methods, and Their Accuracy in Estimating Male and Female Pitch in Clean Speech

Sofia Strömbergsson

Variation in fundamental frequency (F0) constitutes a valuable source of information for researchers across many disciplines with a shared interest in speech. Different methods for estimating F0 vary in estimation accuracy and accessibility, and there is as yet no gold standard. First, through a bibliometric survey, this study examines which methods were most frequently used in the speech science community during the years 2010–2016. Second, the most used methods are evaluated against a ground truth reference, with a specific focus on their accuracy in estimating F0 in male and female speakers, respectively.

The results show that Praat is by far the most frequently used method, followed by STRAIGHT, RAPT and YIN. This pattern holds across a range of different research areas, although within Acoustics and Engineering, Praat’s dominance is less pronounced. In the evaluation of Praat, RAPT and YIN, each with default and gender-adapted settings, Praat also proved to be the most accurate. Adapting Praat’s pitch range settings by gender improved accuracy further, which should encourage researchers to do this routinely.
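To make the idea of evaluating an F0 estimator against a ground truth reference concrete, the following minimal sketch computes a frame-level gross error rate. It is an illustration only and not necessarily the metric used in this study: the function name `gross_error_rate`, the 20% tolerance, the convention that 0 Hz marks an unvoiced frame, and the example values are all assumptions.

```python
from typing import Sequence


def gross_error_rate(
    estimated: Sequence[float],
    reference: Sequence[float],
    tolerance: float = 0.2,  # assumed threshold: 20% relative deviation
) -> float:
    """Fraction of voiced reference frames whose estimate deviates
    from the reference F0 by more than `tolerance` (relative)."""
    if len(estimated) != len(reference):
        raise ValueError("estimated and reference must have the same length")

    # Assumption: a reference value of 0 Hz marks an unvoiced frame,
    # which is excluded from the evaluation.
    voiced = [(e, r) for e, r in zip(estimated, reference) if r > 0]
    if not voiced:
        raise ValueError("reference contains no voiced frames")

    # An estimate of 0 Hz on a voiced frame also counts as a gross error,
    # because it deviates from the reference by 100%.
    errors = sum(1 for e, r in voiced if abs(e - r) > tolerance * r)
    return errors / len(voiced)


# Hypothetical frame-level F0 tracks in Hz (0.0 = unvoiced).
reference = [0.0, 120.0, 122.0, 125.0, 0.0, 210.0]
estimated = [0.0, 121.0, 61.0, 126.0, 0.0, 212.0]  # one octave error at frame 3

rate = gross_error_rate(estimated, reference)
print(f"Gross error rate: {rate:.2f}")
```

Computing such a rate separately for male and female speakers, and for default versus gender-adapted settings, would give the kind of comparison described above.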