In the field of visual object recognition, it is debated whether objects are stored in memory as abstract, 3D, structural representations or, rather, as quasi-pictorial, low-level, 2D analogical representations. The same question can be carried over to the field of auditory word recognition, where one can broadly distinguish between two classes of word recognition models. Models in the first class postulate that the mental representations of word forms specify detailed acoustic/phonetic features and that the mapping between the signal and these representations is a "direct" comparison. Models in the second class postulate abstract phonological representations and a process of phonological parsing that mediates between the signal and these representations. We will present several psycholinguistic experiments that attempt to distinguish between the two types of models.