Self-supervised learning (SSL) speech representations learned from large amounts of diverse, mixed-quality speech data without transcriptions are gaining ground in many speech-technology applications. Prior work has shown that SSL is an effective intermediate representation in two-stage text-to-speech (TTS) for both read and spontaneous speech. However, it is still not clear which SSL model, and which layer within each model, is best suited for spontaneous TTS. We address this shortcoming by extending the scope of comparison for SSL in spontaneous TTS to 6 different SSLs and 3 layers within each SSL. Furthermore, SSL has also shown potential in predicting the mean opinion scores (MOS) of synthesized speech, but this has only been done for read-speech MOS prediction. We extend an SSL-based MOS prediction framework previously developed for scoring read speech synthesis and evaluate its performance on synthesized spontaneous speech. All experiments are repeated on two different spontaneous corpora in order to find generalizable trends. Overall, we present comprehensive experimental results on the use of SSL in spontaneous TTS and MOS prediction to further quantify and understand how SSL can be used in spontaneous TTS. Audio samples: https://www.speech.kth.se/tts-demos/sp_ssl_tts.