This paper presents a framework for investigating the relationship between the auditory and visual modalities in speech. Within this framework, intentional agents analyse multilinear bimodal representations of speech utterances in accordance with an extended computational phonological model.