Talkers are identified more accurately when the language or accent is familiar to the listener. This advantage is presumably due to access to linguistically relevant cues on top of lower-level acoustic information, an explanation implying that reliance on acoustic cues should decrease as language or accent familiarity increases. We tested this prediction by training Mandarin-speaking listeners to identify talkers in three speech conditions: Mandarin-accented English (MAE), Mandarin produced by native speakers (NM), and English produced by native speakers (NE). Using representational similarity analysis, we compared the listeners’ responses with the talkers’ acoustic features (e.g., F0, jitter) to assess acoustic-cue reliance. Results showed greater reliance on acoustic cues in less familiar contexts, supporting the prediction. Notably, in MAE, listeners initially relied more on acoustic cues but reduced that reliance later, highlighting the dynamic nature of talker identification strategies.
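The representational similarity analysis mentioned above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the distance measure (Euclidean distance over z-scored features), the confusion-based perceptual dissimilarity, the Spearman comparison, and all data values are assumptions chosen for illustration, and every function name is hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature (column) so no single unit dominates distances."""
    cols = list(zip(*rows))
    scaled = [[(v - mean(c)) / pstdev(c) for v in c] for c in cols]
    return [list(r) for r in zip(*scaled)]


def acoustic_rdm(features):
    """Upper triangle of pairwise Euclidean distances between talkers (assumed metric)."""
    z = zscore_columns(features)
    return [sqrt(sum((a - b) ** 2 for a, b in zip(z[i], z[j])))
            for i, j in combinations(range(len(z)), 2)]


def perceptual_rdm(confusions):
    """Dissimilarity = 1 - symmetrized confusion rate between two talkers (assumed)."""
    rate = [[c / sum(row) for c in row] for row in confusions]
    return [1 - (rate[i][j] + rate[j][i]) / 2
            for i, j in combinations(range(len(rate)), 2)]


def ranks(values):
    """Average ranks: tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    k = 0
    while k < len(order):
        end = k
        while end + 1 < len(order) and values[order[end + 1]] == values[order[k]]:
            end += 1
        for idx in order[k:end + 1]:
            out[idx] = (k + end) / 2 + 1
        k = end + 1
    return out


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return cov / sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))


# Made-up illustration data: 4 talkers, features = [mean F0 in Hz, jitter in %].
features = [[110, 0.4], [118, 0.5], [205, 0.9], [215, 1.1]]
# Made-up confusion counts: rows = true talker, columns = listener's response.
confusions = [[30, 8, 1, 1],
              [9, 29, 1, 1],
              [1, 1, 31, 7],
              [1, 2, 6, 31]]

# Higher correlation = listeners' confusions track acoustic similarity more closely.
reliance = spearman(acoustic_rdm(features), perceptual_rdm(confusions))
print(f"acoustic-cue reliance (Spearman rho): {reliance:.3f}")
```

Under these assumptions, computing the index separately per condition (MAE, NM, NE) and per training phase would give the kind of comparison the abstract describes, with a lower value indicating weaker reliance on acoustic cues.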