MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations

Heggan, Calum; Hospedales, Tim; Budgett, Sam; Yaghoobi, Mehrdad

doi:10.21437/Interspeech.2023-1064

Contrastive self-supervised learning has gained attention for its ability to create high-quality representations from large unlabelled data sets. A key reason that these powerful features enable data-efficient learning of downstream tasks is that they provide augmentation invariance, which is often a useful inductive bias. However, the amount and type of invariance preferred are not known a priori and vary across downstream tasks. We therefore propose a multi-task self-supervised framework (MT-SLVR) that learns both variant and invariant features in a parameter-efficient manner. Our multi-task representation provides a strong and flexible feature that benefits diverse downstream tasks. We evaluate our approach on few-shot classification tasks drawn from a variety of audio domains and demonstrate improved classification performance on all of them.

Cite as: Heggan, C., Hospedales, T., Budgett, S., Yaghoobi, M. (2023) MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations. Proc. INTERSPEECH 2023, 4399-4403, doi: 10.21437/Interspeech.2023-1064

@inproceedings{heggan23_interspeech,
  author={Calum Heggan and Tim Hospedales and Sam Budgett and Mehrdad Yaghoobi},
  title={{MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
  pages={4399--4403},
  doi={10.21437/Interspeech.2023-1064}
}