Detection of Alzheimer's Dementia (AD) is crucial for timely intervention to slow down disease progression. Using spontaneous speech to detect AD is a non-invasive, efficient, and inexpensive approach. Recent innovations in self-supervised learning (SSL) have led to remarkable advances in speech processing. In this work, we investigate a set of SSL models using a joint fine-tuning strategy and compare their performance with a conventional classification model. Our work shows that fine-tuning the pretrained SSL models, in conjunction with multi-task learning and data augmentation, boosts the effectiveness of general-purpose speech representations in AD detection. The results surpass the baseline and are comparable to state-of-the-art performance on the popular ADReSS dataset. We also compare single- and multi-task training for AD classification, and analyze different augmentation methods to show how they contribute to the improved results.
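To make the joint fine-tuning setup concrete, the sketch below shows one possible way to attach a multi-task head to a pretrained SSL speech encoder and fine-tune the whole stack end to end. It is a minimal illustration under stated assumptions, not the paper's exact architecture: the choice of wav2vec 2.0 as the encoder, the auxiliary MMSE regression task, the mean-pooling step, and the loss weight alpha are all hypothetical placeholders.

```python
# Illustrative sketch only; the abstract does not specify the exact architecture.
# Assumptions: wav2vec 2.0 (HuggingFace transformers) as the SSL encoder, and a
# multi-task head combining AD classification with an auxiliary MMSE regression.
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model

class SSLMultiTaskClassifier(nn.Module):
    def __init__(self, ssl_name="facebook/wav2vec2-base", hidden=256, alpha=0.5):
        super().__init__()
        self.encoder = Wav2Vec2Model.from_pretrained(ssl_name)  # jointly fine-tuned
        dim = self.encoder.config.hidden_size
        self.proj = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Dropout(0.1))
        self.cls_head = nn.Linear(hidden, 2)   # AD vs. non-AD
        self.reg_head = nn.Linear(hidden, 1)   # auxiliary regression head (assumed)
        self.alpha = alpha                     # weight of the auxiliary loss (assumed)

    def forward(self, input_values, labels=None, scores=None):
        # Mean-pool frame-level SSL features into an utterance-level vector.
        hidden = self.encoder(input_values).last_hidden_state.mean(dim=1)
        shared = self.proj(hidden)
        logits = self.cls_head(shared)
        score_pred = self.reg_head(shared).squeeze(-1)
        loss = None
        if labels is not None:
            loss = nn.functional.cross_entropy(logits, labels)
            if scores is not None:  # multi-task: add regression loss when scores exist
                loss = loss + self.alpha * nn.functional.mse_loss(score_pred, scores)
        return {"loss": loss, "logits": logits, "score": score_pred}

# Example forward/backward pass on dummy 16 kHz waveforms.
model = SSLMultiTaskClassifier()
waves = torch.randn(2, 16000 * 5)  # two 5-second clips
out = model(waves, labels=torch.tensor([0, 1]), scores=torch.tensor([29.0, 18.0]))
out["loss"].backward()
```

In a single-task variant, the regression head and its loss term would simply be dropped; data augmentation (e.g., perturbing or masking the input waveforms) would be applied to `waves` before the forward pass.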