State-of-the-art text-independent speaker recognition systems for long recordings (a few minutes) are based on deep neural network (DNN) speaker embeddings. Current implementations of this paradigm use short speech segments (a few seconds) to train the DNN. This introduces a mismatch between training and inference when extracting embeddings for long duration recordings. To address this, we present a DNN refinement approach that updates a subset of the DNN parameters with full recordings to reduce this mismatch. At the same time, we also modify the DNN architecture to produce embeddings optimized for cosine distance scoring. This is accomplished using a large-margin strategy with angular softmax. Experimental validation shows that our approach is capable of producing embeddings that achieve record performance on the SITW benchmark.