ISCA Archive Odyssey 2018

The 2017 NIST Language Recognition Evaluation

Seyed Omid Sadjadi, Timothee Kheyrkhah, Audrey Tong, Craig Greenberg, Douglas Reynolds, Elliot Singer, Lisa Mason, Jaime Hernandez-Cordero

In 2017, the U.S. National Institute of Standards and Technology (NIST) conducted the most recent in an ongoing series of Language Recognition Evaluations (LRE), meant to foster research in robust text- and speaker-independent language recognition and to measure the performance of current state-of-the-art systems. LRE17 was organized in a similar manner to LRE15, focusing on differentiating closely related languages (14 in total) drawn from 5 language clusters, namely Arabic, Chinese, English, Iberian, and Slavic. As in LRE15, LRE17 offered fixed and open training conditions to facilitate cross-system comparisons and to understand the impact of additional, unconstrained amounts of training data on system performance, respectively. There were, however, several differences between LRE17 and LRE15, most notably: 1) use of audio extracted from online videos (AfV) as development and test material, 2) release of a small development set that broadly matched the LRE17 test set, 3) system outputs in the form of log-likelihood scores, rather than log-likelihood ratios, and 4) an alternative cross-entropy-based performance metric. A total of 25 research organizations, forming 18 teams, participated in this one-month-long evaluation and, combined, submitted 79 valid system outputs to be evaluated. This paper presents an overview of the evaluation and an analysis of system performance over all primary evaluation conditions. The evaluation results suggest that 1) language recognition on AfV data was, in general, more challenging than on telephony data, 2) top performing systems exhibited similar performance, 3) the greatest performance improvements were largely due to data augmentation and the use of more complex models for data representation, and 4) effective use of the development set was essential for the top performing systems.
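The abstract notes that LRE17 systems submitted log-likelihood scores and were evaluated with a cross-entropy-based metric. As a rough illustration only (this is not the official LRE17 cost definition; the softmax over log-likelihoods with a flat prior and the use of bits are assumptions made here), an average multiclass cross-entropy over trials can be sketched as:

```python
import math

def cross_entropy_metric(scores, labels):
    """Average multiclass cross-entropy (in bits) over trials.

    scores: list of dicts mapping language -> log-likelihood (natural log)
    labels: list of true languages, one per trial
    Assumes a flat prior over languages, so posteriors reduce to a
    softmax over the log-likelihoods. Illustrative sketch only, not
    the official LRE17 metric.
    """
    total = 0.0
    for trial_scores, true_lang in zip(scores, labels):
        # Numerically stable softmax over the log-likelihoods
        m = max(trial_scores.values())
        exps = {lang: math.exp(s - m) for lang, s in trial_scores.items()}
        z = sum(exps.values())
        posterior = exps[true_lang] / z
        # Penalize low posterior probability on the true language
        total += -math.log2(posterior)
    return total / len(labels)

# Two hypothetical trials over three hypothetical language labels
scores = [
    {"eng": 2.0, "spa": 0.0, "ara": -1.0},
    {"eng": -0.5, "spa": 1.5, "ara": 0.0},
]
labels = ["eng", "spa"]
print(cross_entropy_metric(scores, labels))
```

Lower values are better; a system assigning all posterior mass to the true language per trial would score 0 bits, while a flat posterior over 14 languages would score log2(14) ≈ 3.81 bits.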

doi: 10.21437/Odyssey.2018-12

Cite as: Sadjadi, S.O., Kheyrkhah, T., Tong, A., Greenberg, C., Reynolds, D., Singer, E., Mason, L., Hernandez-Cordero, J. (2018) The 2017 NIST Language Recognition Evaluation. Proc. The Speaker and Language Recognition Workshop (Odyssey 2018), 82-89, doi: 10.21437/Odyssey.2018-12

@inproceedings{sadjadi18_odyssey,
  author={Seyed Omid Sadjadi and Timothee Kheyrkhah and Audrey Tong and Craig Greenberg and Douglas Reynolds and Elliot Singer and Lisa Mason and Jaime Hernandez-Cordero},
  title={{The 2017 NIST Language Recognition Evaluation}},
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2018)},
  year={2018},
  pages={82--89},
  doi={10.21437/Odyssey.2018-12}
}