Self-supervised learning (SSL) models have revolutionized speech representation by extracting rich acoustic and phonetic features with minimal labeled data. However, their computational demands during fine-tuning and their vulnerability to catastrophic forgetting pose challenges for practical deployment. Parameter-efficient fine-tuning (PEFT) methods, such as prompt tuning, are often employed to address these challenges. While prompt tuning has been successful with large language models in natural language processing (NLP), it struggles to learn effective instructional signals when adapting speech SSL models, likely because insufficient prior knowledge hinders soft-token learning during fine-tuning. We introduce Deep Filter Tuning (DFT), a soft-token adaptation strategy that selectively filters semantic information out of noise-distorted representations. By modifying only 0.38% of model weights, DFT achieves a 12% performance gain in noisy environments, offering an efficient solution for robust speech recognition under challenging acoustic conditions such as additive noise.
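The abstract does not specify DFT's internals, but the core idea of a parameter-efficient soft filter over frozen SSL representations can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the per-dimension sigmoid gate, the layer shapes, and the toy backbone size are hypothetical stand-ins, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen SSL backbone output: (time_steps, hidden_dim) features,
# here random stand-ins for noise-distorted representations.
T, D = 50, 768
hidden = rng.standard_normal((T, D))

# Hypothetical soft filter: one learnable gate per hidden dimension.
# The backbone stays frozen; only these D logits would be trained,
# letting the gate suppress noise-dominated dimensions.
gate_logits = np.zeros(D)                   # trainable parameters
gate = 1.0 / (1.0 + np.exp(-gate_logits))   # sigmoid -> values in (0, 1)

filtered = hidden * gate                    # element-wise soft filtering

# Tunable-parameter fraction for a toy 95M-parameter backbone
# (illustrative only; the paper reports 0.38% for its full method).
backbone_params = 95_000_000
fraction = gate_logits.size / backbone_params
print(filtered.shape, round(fraction * 100, 4))
```

With zero-initialized logits the gate starts at 0.5 everywhere, i.e. a neutral filter that training can then sharpen per dimension; a single gate vector like this touches far fewer than 0.38% of the weights, which is why the real method presumably filters at multiple depths.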