Dementia places an immeasurable burden on affected individuals and caregivers. In addition to general cognitive decline, dementia has a negative impact on communication. Technical activation systems are thus in high demand, as cognitive activation may help to moderate the decline. However, effective activation requires sustained engagement, which in turn must first be reliably recognized. In this study, we examine emotional engagement recognition for People with Dementia (PwD) using non-intrusive biosignals derived from speech communication and facial expressions. PwD with mild to severe dementia used a tablet-based activation system over multiple sessions. We demonstrate that they retained their ability to verbally express emotional engagement even at severe stages of the disease. For recognition of emotional engagement, we propose an architecture of Bidirectional Long Short-Term Memory networks that combines video information with up to three speech-based feature sets (eGeMAPS, ComParE’13, DeepSpectrum). Using data from 24 PwD, we show that adding speech improves recognition performance significantly compared to a video-only model. Interestingly, disease progression did not appear to have a substantial impact on recognition performance in this sample. We further discuss the opportunities and challenges of detecting emotional engagement from speech in PwD.