Mamba, a state space model-based architecture, is emerging as a strong alternative to Transformer models, showing equal or superior performance in sequence generation, including speech. However, existing analyses have focused mainly on high-resource scenarios. This paper explores Mamba’s potential for ASR in low-resource scenarios. We compare the Transformer-based Conformer and its state-space counterpart, ConMamba, across nine languages with varying amounts of training data. Our results show that ConMamba achieves WER similar to Conformer on short-context inputs but significantly improves performance on long-context inputs, reducing WER by up to 50% on average. Additionally, ConMamba improves efficiency, requiring 40–45% less training time, using 50% less memory, and accelerating inference by 63–70%, making it a more effective ASR solution across different data availability scenarios.