To be perceived as trustworthy, artificially generated text must be sufficiently congruent with the available discourse history. Pre-trained language models (LMs) operating in generative mode can produce locally coherent phrases, but these do not always reflect salient syntactic, semantic, or pragmatic facets of the prior content. This paper introduces a learnable evaluation metric that assesses the pragmatic pertinence of LM-generated text with respect to a given history. Pertinence aligns closely with qualitative human judgments of acceptability and can be viewed as a blend of sensibleness and specificity. Experiments across several domains and learning architectures show that the approach circumvents the problem of multiple valid ground truths while providing a reliable quantitative ranking of generated completion candidates in context. Pertinence scoring could thus prove useful for detecting hallucinations.