Automatic Speaker Verification (ASV) is widely used in security-sensitive domains, but the growing prevalence of adversarial attacks has seriously compromised the trustworthiness of these systems. Targeted black-box attacks stand out as the most formidable threat and are especially difficult to counteract, while existing defenses exhibit limitations when applied in real-world scenarios. We propose VoiceDefense, a novel adversarial sample detection method that slices an audio sample into multiple segments and captures their local audio features with segment-specific ASV scores. The distributions of these scores differ distinctly between genuine and adversarial samples, and VoiceDefense leverages this difference for detection. VoiceDefense outperforms the state of the art with a best AUC of 0.9624 and is consistently effective against various attacks and perturbation budgets, while maintaining low computational overhead.
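To make the segment-level scoring idea concrete, the following is a minimal sketch and not the VoiceDefense implementation. It assumes fixed-length overlapping segments, cosine similarity as the ASV score, and a toy spectral embedding standing in for a real speaker encoder. The function names (`slice_audio`, `toy_embedding`, `segment_scores`, `score_features`), the segment length, the hop size, and the choice of summary statistics are all hypothetical choices made for illustration.

```python
import numpy as np


def slice_audio(waveform, segment_len, hop):
    """Split a 1-D waveform into fixed-length, possibly overlapping segments."""
    if len(waveform) < segment_len:
        raise ValueError("waveform is shorter than one segment")
    starts = range(0, len(waveform) - segment_len + 1, hop)
    return np.stack([waveform[s:s + segment_len] for s in starts])


def toy_embedding(segment, n_bands=16):
    """Placeholder for a real ASV speaker encoder (an assumption, not from the paper).

    Pools the magnitude spectrum into n_bands frequency bands and applies log1p.
    """
    spectrum = np.abs(np.fft.rfft(segment))
    bands = np.array_split(spectrum, n_bands)
    return np.log1p(np.array([band.mean() for band in bands]))


def cosine_score(a, b):
    """Cosine similarity, used here as a stand-in for the ASV verification score."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def segment_scores(waveform, enrolled, embed_fn=toy_embedding,
                   segment_len=4000, hop=2000):
    """Score every segment of the waveform against the enrolled speaker embedding."""
    segments = slice_audio(waveform, segment_len, hop)
    return np.array([cosine_score(embed_fn(seg), enrolled) for seg in segments])


def score_features(scores):
    """Summarize the per-segment score distribution (hypothetical feature set)."""
    return np.array([scores.mean(), scores.std(), scores.min(), scores.max()])


# Synthetic stand-ins: random noise instead of speech, and an "enrolled"
# embedding computed from the same signal, purely to exercise the pipeline.
rng = np.random.default_rng(0)
audio = rng.standard_normal(16000)
enrolled = toy_embedding(audio)
scores = segment_scores(audio, enrolled)
features = score_features(scores)
```

In a setup like this, the `features` vector (or the raw `scores`) would be passed to a separate binary detector that distinguishes genuine from adversarial inputs. How VoiceDefense actually models the score distributions is not specified in this abstract, so that step is deliberately left out.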