AzAR provides several audio-visual modes of user feedback, e.g. showing animated articulatory organs to correct wrong movements of the tongue, lips, etc., or playing back reference utterances. Its core function, however, is marking mispronounced phones within the spoken utterance on a colour scale from red (“bad”) to green (“good”).
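The red-to-green marking can be illustrated with a minimal sketch. It assumes each phone already carries a pronunciation score normalised to the range 0 to 1; the function name `score_to_rgb` and the linear interpolation are illustrative assumptions, not the colour mapping actually used in AzAR.

```python
from typing import Tuple


def score_to_rgb(score: float) -> Tuple[int, int, int]:
    """Map a score in [0, 1] to an RGB colour between red (0) and green (1).

    Hypothetical helper: the score range and the linear blend are assumptions.
    """
    # Clamp out-of-range scores so the colour always stays on the scale.
    s = min(1.0, max(0.0, score))
    red = round(255 * (1.0 - s))
    green = round(255 * s)
    return (red, green, 0)


# Example: colours for three phones with hypothetical scores.
phone_scores = [("g", 0.9), ("u:", 0.2), ("t", 0.6)]
phone_colours = [(label, score_to_rgb(score)) for label, score in phone_scores]
```

A continuous blend keeps borderline phones visually distinct from clearly good or clearly bad ones; a stepped scale with a few fixed colours would work equally well under the same assumption.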
The marking of mispronounced parts of a user’s utterance is based on several phonetic-phonological and prosodic distance measures. These measures identify typical cross-lingual influences of the native source language (L1) on the target language being taught (L2), such as confusion of specific phoneme classes, wrong phoneme duration, and articulation mistakes, e.g. voicing of unvoiced phonemes.
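The text does not specify how the individual distance measures are computed. As one illustration of a duration-based measure, the sketch below compares the duration of a learner’s phone with that of a reference phone using the absolute log-ratio. The function name `duration_deviation` and the choice of the log-ratio are assumptions made for this example only.

```python
import math


def duration_deviation(observed_s: float, reference_s: float) -> float:
    """Absolute log-ratio of observed to reference phone duration.

    Hypothetical measure: 0.0 means identical durations; a phone that is
    twice as long or half as long as the reference yields the same value.
    """
    if observed_s <= 0 or reference_s <= 0:
        raise ValueError("durations must be positive")
    return abs(math.log(observed_s / reference_s))
```

The log-ratio treats lengthening and shortening symmetrically, which a plain difference in seconds would not do for short phones.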
The practical implementation uses confidence measures at the segmental level from an HMM-based speech recognizer. The AzAR programme follows an extensive phonetic curriculum containing contrastive exercises, insertion tests, etc., which has been compiled from real lessons and is supplemented by a glossary. The AzAR 2 prototype was tested and optimized for Russian migrants in Dresden learning German and runs on PCs (Linux, Windows, Mac OS X). It includes a reference speech database for the language pair Russian (L1)/German (L2).
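A segment-level confidence measure of the kind mentioned above can be sketched as follows. The sketch assumes that a recognizer has already produced, for every frame, the posterior probability of the expected phone, together with phone segment boundaries from an alignment. The names `segment_confidences` and `Segment`, the input layout, and the use of the geometric mean are assumptions for illustration; the exact measure used in AzAR is not stated here.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical layout: (phone label, start frame, end frame exclusive).
Segment = Tuple[str, int, int]


def segment_confidences(
    frame_posteriors: Sequence[float],
    segments: Sequence[Segment],
) -> List[Tuple[str, float]]:
    """Return one confidence per phone segment.

    The confidence is the geometric mean of the per-frame posteriors of the
    expected phone, i.e. the exponential of the mean log posterior.
    """
    results = []
    for label, start, end in segments:
        frames = frame_posteriors[start:end]
        if not frames:
            raise ValueError(f"segment {label!r} contains no frames")
        # Floor the posteriors to avoid log(0) for frames with zero probability.
        mean_log = sum(math.log(max(p, 1e-10)) for p in frames) / len(frames)
        results.append((label, math.exp(mean_log)))
    return results


# Example with hypothetical values: five frames, three phone segments.
posteriors = [0.9, 0.9, 0.1, 0.1, 0.5]
alignment = [("a", 0, 2), ("b", 2, 4), ("c", 4, 5)]
confidences = segment_confidences(posteriors, alignment)
```

In this example the second segment receives a low confidence and would be the candidate for a mark near the “bad” end of the scale.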