Autism spectrum disorder (ASD) is a complex neurodevelopmental condition of unclear cause and varying severity, commonly studied using mouse models. To accurately distinguish between wild-type and ASD model phenotypes, we present a deep learning approach that uses ultrasonic vocalizations as input. The proposed method combines a simple model architecture of convolutional and fully connected layers with a high-resolution representation of time-frequency patterns in the audible and ultrasonic ranges. Our approach surpasses the baseline, achieving an unweighted average recall of 0.806 on 30-second ultrasonic vocalization fragments. This work was conducted for the 1st INTERSPEECH Mice Autism Detection via Ultrasound Vocalization (MAD-UV) Challenge, where it achieved the highest score among all submitted solutions.