In this work we propose a machine learning model for depression detection from transcribed clinical interviews. Depression is a mental disorder that impacts not only the subject’s mood but also the use of language. To this end we use a Hierarchical Attention Network to classify interviews of depressed subjects. We augment the attention layer of our model with a conditioning mechanism on linguistic features, extracted from affective lexica. Our analysis shows that individuals diagnosed with depression use affective language to a greater extent than not-depressed. Our experiments show that external affective information improves the performance of the proposed architecture in the General Psychotherapy Corpus and the DAIC-WoZ 2017 depression datasets, achieving state-of-the-art 71.6 and 70.3 using the test set, F1-scores respectively.