We are working on North Sámi, an under-resourced language for which we have less than ten hours of transcribed speech in total. Previously, we applied large pretrained wav2vec 2.0 Transformer models to this data, but error rates remained high. Here, we present a series of system improvements to these models, yielding minor performance gains. Experimenting with a slightly larger text corpus provides a further small gain. Nonetheless, we conclude that more transcribed speech is needed, at minimum so that standard-sized development and test sets can be created.