This study investigates the use of audiovisual cues in the perception of sound contrasts which have a different phonemic status in the listeners' L1 and L2. Two contrasts differing in the distinctiveness of their visual gestures (/b/-/v/ and /p/-/b/) were presented to Spanish learners of English in audio, visual and audiovisual modalities. Overall identification rates were not significantly higher audiovisually than in the audio-alone condition for either contrast. For the /b/-/v/ contrast, which is visually marked, listeners showed different patterns of performance. A subset of listeners appeared to have acquired the L2 /b/-/v/ contrast and were sensitive to both the acoustic and visual cues marking the contrast. Those at an earlier stage of acquisition of the L2 contrast generally showed poor sensitivity to the visual as well as the acoustic cues.