ISCA Archive WOCCI 2009

A review of ASR technologies for children’s speech

Matteo Gerosa, Diego Giuliani, Shrikanth Narayanan, Alexandros Potamianos

In this paper, we review: (1) the acoustic and linguistic properties of children’s speech for both read and spontaneous speech, and (2) the developments in automatic speech recognition for children with application to spoken dialogue and multimodal dialogue system design. First, the effect of developmental changes on the absolute values and variability of acoustic correlates is presented for read speech for children aged 6 and up. Then, spontaneous verbal child-machine interaction is reviewed and results from recent studies are presented. Age trends of acoustic, linguistic and interaction parameters are discussed, such as sentence duration, filled pauses, politeness and frustration markers, and modality usage. Some differences between child-machine and human-human interaction are pointed out. The implications for acoustic modeling, linguistic modeling and spoken dialogue system design for children are presented. We conclude with a review of relevant applications of spoken dialogue technologies for children.