Learning continually from data is a task executed effortlessly by humans, but it remains a significant challenge for machines. Moreover, machines often fail to generalize when encountering unknown test scenarios. We propose a mathematically motivated, dynamically expanding, end-to-end model of independent sequence-to-sequence components, each trained on a different data set, that avoids catastrophic forgetting of knowledge acquired from previously seen data while seamlessly integrating knowledge from new data. During inference, the likelihood of the unknown test scenario under each component is computed from the distribution of its internal activations. The prediction of each independent component is then weighted by its normalized likelihood to obtain the final decision.
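To make the inference scheme concrete, the following is a minimal sketch of likelihood-weighted combination, under assumptions not specified in the abstract: each component's activation distribution is modeled as a diagonal Gaussian fitted on its own training data, normalized weights are obtained via a softmax over log-likelihoods, and a random linear map stands in for a trained sequence-to-sequence component. The `Component` class and all names here are hypothetical illustrations, not the paper's actual architecture.

```python
import numpy as np


class Component:
    """One independent expert. Stores a diagonal-Gaussian fit of its
    internal activations, used later to score an unknown test input."""

    def __init__(self, rng, dim):
        # Placeholder "network": a fixed random linear map standing in
        # for a trained sequence-to-sequence component (assumption).
        self.W = rng.normal(size=(dim, dim))
        self.mu = None
        self.var = None

    def activations(self, x):
        return np.tanh(x @ self.W)

    def fit_activation_stats(self, X_train):
        # Per-dimension mean/variance of activations on this
        # component's own training data.
        A = self.activations(X_train)
        self.mu = A.mean(axis=0)
        self.var = A.var(axis=0) + 1e-6  # avoid zero variance

    def log_likelihood(self, x):
        # Diagonal-Gaussian log-likelihood of the test activation.
        a = self.activations(x)
        return -0.5 * np.sum(
            (a - self.mu) ** 2 / self.var + np.log(2 * np.pi * self.var)
        )

    def predict(self, x):
        # Stand-in for the component's actual prediction.
        return self.activations(x)


def weighted_inference(components, x):
    """Weight each component's prediction by its normalized likelihood."""
    log_liks = np.array([c.log_likelihood(x) for c in components])
    # Softmax over log-likelihoods gives normalized weights.
    w = np.exp(log_liks - log_liks.max())
    w /= w.sum()
    preds = np.stack([c.predict(x) for c in components])
    return w @ preds, w


rng = np.random.default_rng(0)
dim = 8
components = []
for i in range(3):  # one component per previously seen data set
    c = Component(rng, dim)
    c.fit_activation_stats(rng.normal(loc=i, size=(100, dim)))
    components.append(c)

x_test = rng.normal(loc=1.0, size=(dim,))
pred, weights = weighted_inference(components, x_test)
print("normalized likelihood weights:", np.round(weights, 3))
```

In this sketch, a test input drawn near one component's training distribution receives a high activation likelihood under that component, so its prediction dominates the weighted combination, which is the behavior the abstract describes.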