In this paper, we propose two neural network-based approaches, One Speaker One Network and One Speaker Multiple Networks, for text-dependent speaker verification using suprasegmental features. The suprasegmental features used in this study are pitch accent and durational features, which are extracted using properties of intonation patterns and duration. We also propose an approach that combines the evidence present at the segmental and suprasegmental levels to improve the performance of the verification system.