We present hierarchical recursive CTC (HR-CTC), an effective hierarchical multi-task learning (HMTL) model for end-to-end automatic speech recognition (ASR). HMTL enables a model to learn intermediate representations suited to predicting high-level, sparse targets (e.g., words). This is achieved by applying auxiliary CTC losses to intermediate layers of the model, computed from lower-level targets (e.g., phonemes or smaller subwords). In this work, we propose to enhance the hierarchical generation capability of HMTL by designing a recursive structure that iteratively reuses the same model layers to refine intermediate predictions. These refined predictions explicitly condition the deeper model layers, facilitating more accurate predictions at the higher level. Experimental results show that HR-CTC outperforms conventional HMTL models across various ASR tasks, with the additional benefit of balancing accuracy and inference speed.
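As a rough illustrative sketch (not the paper's implementation), the core mechanism described above — reusing the same layer recursively to refine intermediate low-level predictions, then explicitly conditioning the hidden features on those predictions — can be mocked up in NumPy. All parameter names, layer choices, and sizes below are hypothetical stand-ins; a real system would use Conformer/Transformer blocks trained with CTC losses at each level.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8        # hidden feature size (hypothetical)
V_LOW = 5    # low-level vocabulary size, e.g., phonemes (hypothetical)
T = 12       # number of acoustic frames

# Hypothetical parameters; in practice these would be trained with
# an auxiliary CTC loss on the intermediate (low-level) posteriors.
W_layer = rng.standard_normal((D, D)) * 0.1        # shared encoder layer, reused recursively
W_out_low = rng.standard_normal((D, V_LOW)) * 0.1  # intermediate low-level prediction head
W_cond = rng.standard_normal((V_LOW, D)) * 0.1     # projects predictions back into features

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def encoder_layer(h):
    # Stand-in for a Conformer/Transformer block.
    return np.tanh(h @ W_layer)

def recursive_block(h, n_iters=2):
    """Apply the SAME layer n_iters times, each time refining the
    intermediate low-level posteriors and conditioning the features
    on them (the recursion + explicit conditioning idea)."""
    for _ in range(n_iters):
        h = encoder_layer(h)
        low_post = softmax(h @ W_out_low)  # intermediate low-level predictions
        h = h + low_post @ W_cond          # condition deeper processing on them
    return h, low_post

x = rng.standard_normal((T, D))            # toy frame-level features
h, low_post = recursive_block(x, n_iters=2)
```

Deeper layers would then consume `h` to predict the high-level targets (e.g., words), and increasing `n_iters` at inference time is one way the accuracy/speed balance mentioned above could be exposed.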