Some theories concerning speech mechanisms assume that overlapping representations are involved in programming certain articulatory gestures and hand actions. In previous studies we have shown a compatibility effect between pronouncing or hearing meaningless syllables like [kɑ] and [ti] and simultaneously performing a power or a precision grip, respectively. The present study investigated whether action selection is necessary for the effect to manifest. Participants were first presented with a visual cue for the upcoming manual response. A written syllable “ka” or “ti” was then presented, at which point the cued grip was performed and the syllable pronounced. There was also a condition in which the grip was cued but only the vocal response was performed. Manual and vocal reaction times were faster when the grip and syllable were compatible (e.g. power & [kɑ]) than when they were incompatible (e.g. precision & [kɑ]). When no grip was performed (only cued), the effect was still apparent in vocal reaction times. These results suggest that preparing a manual action is sufficient to influence vocalizations, and that action selection is not mandatory for this kind of syllable-grip correspondence.