Audio Question Answering (AQA) is a multi-modal translation task where a system analyzes an audio signal and a natural language question to generate a desirable natural language answer. AQA has been primarily studied through the lens of the English language. However, addressing AQA in other languages, in the same manner, would require a considerable amount of resources. This paper proposes scalable solutions to multi-lingual audio question answering on both data and modeling fronts. We propose mClothoAQA, a translation-based multi-lingual AQA dataset in eight languages. The dataset consists of 1991 audio files and nearly 0.3 million question-answer pairs. Finally, we introduce a multi-lingual AQA model and demonstrate its strong performance in eight languages. The dataset and code can be accessed at https://github.com/swarupbehera/mAQA.