In this paper, we propose neural networks for acoustic direction-of-arrival (DOA) estimation. Conventional signal processing tasks such as DOA estimation have benefited from recent advances in deep learning, which enable a data-driven approach in which neural networks are employed as black boxes. From a traditional signal processing perspective, however, modern network models often lack interpretability when applied directly. As an alternative, we introduce a learnable network derived from spatial acoustic DOA estimation. Convolutional variants of the feature projection can be derived while maintaining explainability from both the acoustic and the neural network perspectives. We also introduce factorized spatial-temporal-spectral filtering, which can significantly reduce computational cost and memory footprint. Experiments show that the proposed networks perform well in harsh acoustic conditions while requiring fewer hardware resources.
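To give a rough sense of why factorization can reduce cost, the following is a minimal sketch and not the proposed network. It assumes the simplest possible factorization, in which a joint spatial-temporal-spectral kernel is the outer product of three one-dimensional kernels. The kernel sizes, the input shape, and the function names `full_conv3d` and `factorized_conv3d` are hypothetical and chosen only for illustration.

```python
import numpy as np

def full_conv3d(x, k):
    """Reference linear 3-D convolution using zero-padded FFTs."""
    shape = [xs + ks - 1 for xs, ks in zip(x.shape, k.shape)]
    axes = (0, 1, 2)
    spectrum = np.fft.rfftn(x, s=shape, axes=axes) * np.fft.rfftn(k, s=shape, axes=axes)
    return np.fft.irfftn(spectrum, s=shape, axes=axes)

def factorized_conv3d(x, k_space, k_time, k_freq):
    """Apply three 1-D filters in sequence, one per axis (assumed separable kernel)."""
    y = np.apply_along_axis(np.convolve, 0, x, k_space)  # spatial axis
    y = np.apply_along_axis(np.convolve, 1, y, k_time)   # temporal axis
    y = np.apply_along_axis(np.convolve, 2, y, k_freq)   # spectral axis
    return y

rng = np.random.default_rng(0)

# Hypothetical input: (microphones, time frames, frequency bins).
x = rng.standard_normal((8, 20, 16))

# Hypothetical 1-D kernels; sizes are illustrative only.
k_space = rng.standard_normal(4)
k_time = rng.standard_normal(5)
k_freq = rng.standard_normal(7)

# The equivalent joint kernel is the outer product of the three factors.
k_joint = np.einsum("i,j,k->ijk", k_space, k_time, k_freq)

y_full = full_conv3d(x, k_joint)
y_fact = factorized_conv3d(x, k_space, k_time, k_freq)

# Parameter counts: product of kernel sizes versus their sum.
n_params_full = k_joint.size
n_params_fact = k_space.size + k_time.size + k_freq.size

print("outputs match:", np.allclose(y_full, y_fact))
print("parameters (joint vs. factorized):", n_params_full, n_params_fact)
```

Under this assumption the joint kernel needs as many weights as the product of the three kernel sizes, while the factorized form needs only their sum, and the two produce the same output. A learned factorization need not be exactly rank one, so the savings in a real network may differ.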