This paper presents an ITMO university system submitted to the Speakers in the Wild (SITW) Speaker Recognition Challenge. During evaluation track of the SITW challenge we explored conventional universal background model (UBM) Gaussian mixture model (GMM) i-vector systems and recently developed DNN-posteriors based i-vector systems. The systems were investigated under the real-world media channel conditions represented in the challenge. This paper discusses practical issues of the robust i-vector systems training and performs investigation of denoising autoencoder (DAE) based back-end when applied to “in the wild” conditions. Our speak-er diarization approach for “multi-speaker in the file” conditions is also briefly presented in the paper. Experiments per-formed on the evaluation dataset demonstrate that DNN- based i-vector systems are superior to the UBM-GMM based sys-tems and applying DAE-based back-end helps to improve system performance.