Sound scene geotagging is a new research topic that has evolved
from acoustic scene classification. It is motivated by the idea of
audio surveillance. Beyond merely describing the scene in a recording,
a machine that can identify where the recording was captured would be
widely useful. In this paper, we explore a series of common audio data
augmentation methods to evaluate which best improves the accuracy of
audio geotagging classifiers.
Our work improves on the state-of-the-art city geotagging method by
23% in classification accuracy.