It is widely believed that a reliable performance comparison requires applying a common set of test samples to all the recognition systems under test. But to what extent is this true? We examine the question using the Akaike Information Criterion (AIC). The analysis confirms that, as has been believed, applying a common set of test samples yields more power to discriminate performance differences than applying an independent set of samples to each system. The gain is, however, not as large as might be expected. The effect of applying a common set of test samples to two systems under test becomes prominent when we measure and use the number of samples recognized correctly by both systems, in addition to the number of samples recognized correctly by each system.
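
As a rough illustration of the kind of comparison described above, the following Python sketch computes AIC for a "same accuracy" model and a "different accuracy" model of two recognition systems. It is a minimal sketch under stated assumptions, not necessarily the formulation used in this work: the independent-sets case is modeled with two binomial distributions, the common-set case with a four-cell multinomial over (both correct, only system 1 correct, only system 2 correct, neither correct), and all function names and sample counts are hypothetical.

```python
import math


def xlogy(count, prob):
    """Return count * log(prob), treating 0 * log(0) as 0."""
    return 0.0 if count == 0 else count * math.log(prob)


def aic(log_likelihood, num_params):
    """Akaike Information Criterion: -2 * log-likelihood + 2 * (free parameters)."""
    return -2.0 * log_likelihood + 2.0 * num_params


def binom_loglik(correct, total, p):
    # The binomial coefficient is omitted: it is identical in both models
    # and cancels when their AIC values are compared.
    return xlogy(correct, p) + xlogy(total - correct, 1.0 - p)


def aic_independent(k1, n1, k2, n2):
    """(AIC_same, AIC_diff) when each system is tested on its own sample set.

    k1, k2: samples recognized correctly; n1, n2: test-set sizes (assumed setup).
    """
    pooled = (k1 + k2) / (n1 + n2)  # MLE under the shared-accuracy model
    ll_same = binom_loglik(k1, n1, pooled) + binom_loglik(k2, n2, pooled)
    ll_diff = binom_loglik(k1, n1, k1 / n1) + binom_loglik(k2, n2, k2 / n2)
    return aic(ll_same, 1), aic(ll_diff, 2)


def aic_paired(both, only1, only2, neither):
    """(AIC_same, AIC_diff) when both systems see a common sample set.

    Uses the joint counts, including the number recognized correctly by both.
    """
    n = both + only1 + only2 + neither
    # Unconstrained multinomial over the four cells: 3 free parameters.
    ll_diff = sum(xlogy(c, c / n) for c in (both, only1, only2, neither))
    # Equal accuracy forces P(only1) == P(only2): 2 free parameters.
    q = (only1 + only2) / (2 * n)
    ll_same = (xlogy(both, both / n) + xlogy(only1 + only2, q)
               + xlogy(neither, neither / n))
    return aic(ll_same, 2), aic(ll_diff, 3)


if __name__ == "__main__":
    # Hypothetical counts: system 1 gets 900/1000 correct, system 2 gets 870/1000.
    same_i, diff_i = aic_independent(900, 1000, 870, 1000)
    same_p, diff_p = aic_paired(both=860, only1=40, only2=10, neither=90)
    print("independent sets: AIC_same - AIC_diff =", same_i - diff_i)
    print("common set:       AIC_same - AIC_diff =", same_p - diff_p)
```

A positive value of `AIC_same - AIC_diff` means the "different accuracy" model is preferred. With these illustrative counts, the margin is larger for the common-set case that uses the joint counts, which is the qualitative behavior described above.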