In representation learning, the promise of disentanglement methods is to decompose an input signal into a set of independent and interpretable attributes. Metrics such as the DCI or MIG scores have been proposed to evaluate how well this goal is achieved; they analyse the relationship between the representation components and the target attributes. This paper shows that, even when applied to synthetic datasets generated from a closed set of generative factors, these metrics can be too optimistic. In particular, it reports that a generative factor can still be recovered from an altered disentangled representation from which, according to the metrics, it has supposedly been removed. Based on this observation, a new criterion called latent decimation is proposed, which evaluates disentanglement through the accuracy of factor prediction from subsets of latent variables. A new metric called MIDCI is defined, and its relevance is demonstrated on voice data.
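The abstract does not specify how factor prediction from latent subsets is carried out, so the following Python sketch is only one plausible reading of the latent decimation idea: a simple classifier is fit on each small subset of latent dimensions, and its held-out accuracy indicates how much information about the factor that subset carries. The function name `decimation_scores`, the use of logistic regression, and the exhaustive subset enumeration are illustrative assumptions, not the paper's actual procedure.

```python
import itertools

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def decimation_scores(latents, factor, max_subset_size=2):
    """Hypothetical latent-decimation probe.

    For every subset of latent dimensions up to `max_subset_size`,
    fit a simple classifier and record its held-out accuracy at
    predicting the generative factor from that subset alone.
    High accuracy from dimensions that a metric deems unrelated to
    the factor would signal that the metric is too optimistic.
    """
    n_dims = latents.shape[1]
    scores = {}
    for k in range(1, max_subset_size + 1):
        for subset in itertools.combinations(range(n_dims), k):
            X = latents[:, subset]
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, factor, test_size=0.3, random_state=0
            )
            clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
            scores[subset] = clf.score(X_te, y_te)
    return scores


if __name__ == "__main__":
    # Toy check: 200 samples, 5 latent dimensions, a binary factor
    # that depends (almost) only on the first dimension.
    rng = np.random.default_rng(0)
    z = rng.normal(size=(200, 5))
    y = (z[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)
    best = max(decimation_scores(z, y).items(), key=lambda kv: kv[1])
    print("best subset and accuracy:", best)
```

Under this reading, an aggregate score such as MIDCI would be some summary of these per-subset accuracies; how the paper actually aggregates them is not stated in the abstract.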