We consider the problem of robust watermarking of speech signals using the spread spectrum method. To date, this method has primarily been applied to music signals. Here we discuss the differences between speech and music and the implications these differences have for the use of spread spectrum watermarking. Moreover, we propose enhancements to speech watermarking for detecting deepfake attacks at call centres, using classical signal processing techniques and deep learning.