Speaker Verification and Utterance Verification are examples of techniques that can be used for Speaker Authentication purposes. Speaker Verification consists of accepting or rejecting the claimed identity of a speaker by processing samples of his/her voice. Utterance Verification systems make use of a set of speaker-independent speech models to recognize a certain utterance and decide whether a speaker has uttered it or not. If the utterances consist of passwords, this technique can be used for identity verification. Up to now, both techniques have been used separately. We propose an architecture in which both systems operate in parallel and their outputs are merged by a novel combination technique: a Neural Network is trained to learn from the data how to balance the influence of both outputs so as to jointly minimize the False Acceptance and False Rejection rates. The performance of this architecture is compared with that of the individual systems on an over-the-phone speaker recognition task, where the combined system outperforms both.
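To illustrate the kind of score fusion described above, the following is a minimal sketch (not the authors' actual network): a small neural network is trained on synthetic Speaker Verification and Utterance Verification scores to produce a single accept/reject decision, with a cross-entropy loss penalizing false acceptances and false rejections jointly. All data, dimensions, and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scores: column 0 = speaker-verification score, column 1 = utterance-verification score.
# Labels: 1 = true claimant, 0 = impostor. (Synthetic data, for illustration only.)
n = 500
genuine  = rng.normal(loc=[1.0, 0.8],   scale=0.5, size=(n, 2))
impostor = rng.normal(loc=[-0.8, -0.6], scale=0.5, size=(n, 2))
X = np.vstack([genuine, impostor])
y = np.concatenate([np.ones(n), np.zeros(n)])

# One-hidden-layer network trained with binary cross-entropy, so that
# false acceptances and false rejections are minimized jointly.
W1 = rng.normal(scale=0.1, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 1)); b2 = np.zeros(1)
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(2000):
    h = np.tanh(X @ W1 + b1)            # hidden layer
    p = sigmoid(h @ W2 + b2).ravel()    # acceptance probability
    # Backpropagate the cross-entropy loss.
    d_out = (p - y)[:, None] / len(y)
    dW2 = h.T @ d_out; db2 = d_out.sum(0)
    d_h = (d_out @ W2.T) * (1 - h ** 2)
    dW1 = X.T @ d_h;    db1 = d_h.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# Accept the identity claim when the fused score exceeds a threshold.
decisions = p > 0.5
fa = np.mean(decisions[y == 0])    # False Acceptance rate on impostor trials
fr = np.mean(~decisions[y == 1])   # False Rejection rate on genuine trials
print(f"FA = {fa:.3f}, FR = {fr:.3f}")
```

The essential point is only that the combiner is learned from data rather than fixed by hand, so the relative weight given to the two verification scores is tuned to the error trade-off observed on the training material.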