Clinical depression is one of the most common mental disorders, and technology for the remote assessment of depression, including the monitoring of treatment response, is becoming increasingly important. Using a cloud-based multimodal dialog platform, we conducted a crowdsourced study to investigate the effect of depression severity and antidepressant use on various acoustic, linguistic, cognitive, and orofacial features. Our findings show that multiple features from all tested modalities exhibit statistically significant differences between subjects with no or minimal depression and subjects with more severe depression symptoms. Moreover, certain acoustic and visual features differ significantly between subjects with moderately severe or severe symptoms who take antidepressants and those who do not. Machine learning experiments show that subjects with and without antidepressant medication can be discriminated from each other more reliably at higher severity levels.
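To make the two analysis steps summarized above concrete, the sketch below illustrates their general shape: a nonparametric per-feature comparison between severity groups, followed by a classifier distinguishing antidepressant users from non-users within a high-severity stratum. This is a minimal illustration only; the feature names, severity labels, synthetic data, statistical test, and model are hypothetical placeholders and are not the study's actual feature sets, tests, or pipeline.

```python
# Illustrative sketch only: feature names, labels, and models are
# hypothetical placeholders, not the study's actual pipeline.
import numpy as np
import pandas as pd
from scipy.stats import mannwhitneyu
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical per-subject feature table with a severity label and a
# self-reported antidepressant-use flag (synthetic data for illustration).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "speaking_rate": rng.normal(4.0, 0.5, n),     # acoustic (placeholder)
    "pause_duration": rng.normal(0.6, 0.2, n),    # acoustic (placeholder)
    "smile_intensity": rng.normal(0.3, 0.1, n),   # orofacial (placeholder)
    "severity": rng.choice(["minimal", "moderate", "severe"], n),
    "on_antidepressant": rng.integers(0, 2, n),
})

# (1) Per-feature group comparison: no/minimal vs. more severe symptoms.
for feat in ["speaking_rate", "pause_duration", "smile_intensity"]:
    minimal = df.loc[df.severity == "minimal", feat]
    severe = df.loc[df.severity != "minimal", feat]
    stat, p = mannwhitneyu(minimal, severe)
    print(f"{feat}: U={stat:.1f}, p={p:.3f}")

# (2) Discriminate antidepressant use within the high-severity stratum.
high = df[df.severity == "severe"]
X = high[["speaking_rate", "pause_duration", "smile_intensity"]]
y = high["on_antidepressant"]
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=5, scoring="roc_auc")
print(f"severe stratum AUC: {scores.mean():.2f} ± {scores.std():.2f}")
```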