Pronunciation training is essential for learning a second language, yet computer-assisted pronunciation training tools have been developed for only a small subset of languages. This paper investigates how well Wav2Vec2-BERT detects language learner pronunciation errors in low-resource settings by creating and evaluating three fine-tuned versions of the model. The results provide insight into how the data used for fine-tuning affects performance, together with a thorough analysis of erroneous predictions. The evaluation of Wav2Vec2-BERT for this task offers a case study of an under-resourced language and suggestions for how a large pre-trained speech model can be used to develop pronunciation training tools in under-resourced settings.