Automatic speech recognition is hindered by the linguistic differences that occur in accented speech. This paper presents a CNN-based method for classifying accented speech, trained and tested on English spoken with Germanic, Romance and Slavic accents. We examined the input feature set to find the optimal combination of time-frequency and energy characteristics of speech to feed into the model, and we tuned both the model hyperparameters and the dimensionality of the input features. We argue that mel-scale amplitude spectrograms on a linear scale appear more powerful in accent classification tasks than conventional feature sets based on MFCCs and raw spectrograms. Our models used only sparse data from the Speech Accent Archive, yet produced state-of-the-art classification results for English with Germanic, Romance and Slavic accents. The accuracy of our models trained on linear-scale amplitude mel-spectrograms ranged from 0.964 to 0.987, outperforming existing models that classify accents using the same dataset.