Depression is one of the most common mental disorders and seriously affects individual health. Research has shown an association between an individual's level of depression and features of their speech, and many automatic speech-based depression detection systems have been proposed as a result. A number of studies have used convolutional neural networks (CNNs) for speech-based depression detection. However, most of these studies do not take into account that different frequencies and time steps in speech spectral features contribute unequally to the detection of depression. To extract more salient and discriminative features, this paper proposes a frequency-time attention (FTA) module for CNNs, which is based on squeeze-and-excitation operations and emphasizes the time steps and frequencies associated with depression. Experimental results on the AVEC 2013 and AVEC 2014 benchmarks demonstrate the effectiveness of the proposed method.
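
To make the squeeze-and-excitation idea concrete, the following is a minimal sketch of how attention over frequencies and time steps could be computed for a single spectrogram. It is an illustration under stated assumptions, not the FTA module itself: the function names, the reduction ratio, the random weights, the single-channel input, and the multiplicative combination of the two gates are all hypothetical choices made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def dense(vec, weights, activation):
    # weights is a list of rows; each row has one entry per element of vec.
    return [activation(sum(w * v for w, v in zip(row, vec))) for row in weights]


def excite(descriptor, w_reduce, w_expand):
    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid,
    # giving one gate in (0, 1) per entry of the descriptor.
    hidden = dense(descriptor, w_reduce, relu)
    return dense(hidden, w_expand, sigmoid)


def frequency_time_attention(spec, freq_weights, time_weights):
    """Rescale a spectrogram (list of F rows, each with T values).

    Assumption: the two gates are combined multiplicatively; other
    combinations are possible.
    """
    n_freq, n_time = len(spec), len(spec[0])
    # Squeeze: average over time for each frequency bin, and over
    # frequency for each time step.
    freq_desc = [sum(row) / n_time for row in spec]
    time_desc = [sum(spec[f][t] for f in range(n_freq)) / n_freq
                 for t in range(n_time)]
    freq_gate = excite(freq_desc, *freq_weights)
    time_gate = excite(time_desc, *time_weights)
    # Scale: weight every time-frequency cell by both gates.
    scaled = [[spec[f][t] * freq_gate[f] * time_gate[t] for t in range(n_time)]
              for f in range(n_freq)]
    return scaled, freq_gate, time_gate


def random_weights(size, reduction, rng):
    # Hypothetical stand-in for learned parameters.
    hidden = max(1, size // reduction)
    w_reduce = [[rng.uniform(-1, 1) for _ in range(size)] for _ in range(hidden)]
    w_expand = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(size)]
    return w_reduce, w_expand


rng = random.Random(0)
spec = [[rng.random() for _ in range(6)] for _ in range(4)]  # 4 bins x 6 steps
freq_w = random_weights(4, 2, rng)
time_w = random_weights(6, 2, rng)
out, fg, tg = frequency_time_attention(spec, freq_w, time_w)
```

In a trained network the excitation weights would be learned jointly with the CNN, so that frequency bins and time steps more strongly associated with depression receive gates closer to 1 and the rest are attenuated.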