ISCA Archive Interspeech 2015

Exploring ANN back-ends for i-vector based speaker age estimation

Anna Fedorova, Ondřej Glembek, Tomi Kinnunen, Pavel Matějka

We address the problem of speaker age estimation using i-vectors. We first compare different i-vector extraction setups and then focus on (shallow) artificial neural network (ANN) back-ends. We explore the ANN architecture, the training algorithm, and ANN ensembles. The results on NIST 2008 and 2010 SRE data indicate that, after extensive parameter optimization, an ANN back-end in combination with i-vectors reaches mean absolute errors (MAEs) of 5.49 (females) and 6.35 (males), a 4.5% relative improvement over our support-vector regression (SVR) baseline. The choice of back-end therefore had little effect on accuracy, and we suggest that future work focus more on front-end processing.
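For readers unfamiliar with the two figures of merit quoted above, the following is a minimal sketch of how a mean absolute error and a relative improvement over a baseline are typically computed. The function names and the toy ages are hypothetical and chosen only for illustration; they are not taken from the paper's data or code.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average of |true - predicted| over all speakers (in years)."""
    if not true_ages or len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def relative_improvement(baseline_mae, system_mae):
    """Percentage reduction in MAE relative to the baseline MAE."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return 100.0 * (baseline_mae - system_mae) / baseline_mae


# Toy example with made-up ages (NOT results from the paper).
true_ages = [25, 40, 60, 33]
baseline_predictions = [30, 35, 50, 33]  # hypothetical baseline back-end output
system_predictions = [29, 36, 52, 33]    # hypothetical alternative back-end output

baseline_mae = mean_absolute_error(true_ages, baseline_predictions)
system_mae = mean_absolute_error(true_ages, system_predictions)
improvement = relative_improvement(baseline_mae, system_mae)

print(f"baseline MAE: {baseline_mae:.2f} years")
print(f"system MAE:   {system_mae:.2f} years")
print(f"relative improvement: {improvement:.1f}%")
```

A lower MAE is better, so a positive relative improvement means the evaluated system's average age error is smaller than the baseline's.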