Recent rapid advances in Speech Large Language Models (SpeechLLMs) have greatly accelerated progress in Speech Emotion Recognition (SER). However, SpeechLLMs rely on powerful semantic encoders and acoustically uninformative pre-training data, and thus pay limited attention to acoustic information, which is closely tied to the emotion conveyed in speech. In this paper, we leverage acoustic properties correlated with emotion to automatically generate acoustic descriptions. These descriptions are combined with semantic representations as inputs to the LLM, enhancing its emotion recognition capability. Accordingly, we propose AA-SLLM, an acoustically augmented SpeechLLM that adopts instruction fine-tuning via Low-Rank Adaptation (LoRA). Experimental results show that AA-SLLM effectively alleviates the class imbalance problem while improving overall performance. Furthermore, AA-SLLM achieves state-of-the-art results on the IEMOCAP, MELD, and LSSED datasets.
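The core idea of turning acoustic properties into text that an LLM can consume can be sketched as follows. This is a minimal illustration under assumed choices: the feature (frame-level RMS energy as a loudness/arousal proxy), thresholds, and description templates are hypothetical stand-ins, not the ones used in AA-SLLM.

```python
import numpy as np

def acoustic_description(signal, sr=16000):
    """Map simple acoustic statistics to a textual description.

    Illustrative only: the feature set and wording templates here are
    assumptions, not the actual AA-SLLM design.
    """
    # Frame-level RMS energy as a rough proxy for loudness/arousal.
    frame = sr // 100  # 10 ms frames
    n = len(signal) // frame * frame
    frames = signal[:n].reshape(-1, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    loudness = "loud" if rms.mean() > 0.1 else "soft"
    variability = "highly variable" if rms.std() > 0.05 else "steady"
    return f"The speech is {loudness} with {variability} energy."

def build_prompt(description, instruction="Identify the speaker's emotion."):
    # Prepend the acoustic description to the task instruction, mirroring
    # the idea of combining acoustic text with semantic LLM inputs.
    return f"Acoustic cues: {description}\n{instruction}"

# Hypothetical usage with a synthetic 220 Hz tone standing in for speech.
sig = 0.2 * np.sin(2 * np.pi * 220 * np.linspace(0, 1, 16000))
print(build_prompt(acoustic_description(sig)))
```

In the paper's pipeline the resulting description is paired with the semantic speech representation before being fed to the LLM; the sketch above only shows the text-generation half of that combination.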