In 2015, NIST conducted the most recent in an ongoing series of Language
Recognition Evaluations (LRE) meant to foster research in language
recognition. The 2015 Language Recognition Evaluation featured 20 target
languages grouped into 6 language clusters. The evaluation was focused
on distinguishing languages within each cluster, without disclosing
which cluster a test language belongs to.
The 2015 evaluation
introduced several new aspects, such as using limited and specified
training data and a wider range of durations for test segments. Unlike
in past LREs, systems were not required to output hard decisions
for each test language and test segment; instead, they were required
to provide a vector of log-likelihood ratios indicating how likely it is
that a test segment matches each target language. A total of 24 research
organizations participated in this four-month-long evaluation, together
submitting 167 systems. The evaluation results showed that top-performing
systems exhibited similar performance and that performance varied widely
across language clusters and across within-cluster language pairs.
Among the 6 clusters, the French cluster was the hardest to
recognize, with near-random performance, and the Slavic cluster was
the easiest to recognize.