Domain Continual Learning Framework
- Domain continual learning is a framework that incrementally trains models on sequential, varying domains while effectively mitigating catastrophic forgetting.
- It employs techniques such as safe policy adaptation, dynamic expert expansion, and generative replay to manage domain shifts and preserve previous knowledge.
- Empirical evaluations show improved robustness in robotics, visual tasks, and dialogue systems through integrated architectures designed for multi-domain learning.
A domain continual learning framework enables machine learning models to incrementally acquire knowledge across a sequence of data distributions—each considered a domain—while actively retaining previously acquired skills and minimizing catastrophic forgetting. These frameworks address realistic non-stationarity in data sources, label spaces, system parameters, or underlying dynamics, and extend standard continual learning by explicitly managing cross-domain transfer, robustness to domain shifts, and memory-efficient retention mechanisms. Recent advances encompass safe policy adaptation for robotics, multi-source incremental learning, generative concept embedding, unsupervised continual adaptation, and integrated architectures for reinforcement, supervised, and self-supervised learning across domains.
1. Formal Definition and Scope
Domain continual learning seeks to learn from a temporally ordered series of domains $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_T$, where each domain may induce variations in input distribution, label space, system dynamics, or observation modalities. Unlike traditional continual learning frameworks, the domain continual setting focuses on the ability to generalize across drifted, unseen domains (domain generalization), adapt online to new domain characteristics (domain adaptation), and mitigate forgetting of previous domain-specific knowledge. Tasks are formalized as:
- Supervised domain sequence: labeled pairs $(x, y) \sim P_t(X, Y)$ drawn from each domain $\mathcal{D}_t$, possibly with latent or explicit domain variation $d_t$.
- Unsupervised/targeted domain sequence: inputs $x \sim P_t(X)$, label-free but subject to domain drift.
- Reinforcement learning with domain dynamics: state–action–reward tuples $(s, a, r)$ sampled from domain-specific transition kernels with evolving system parameters.
The central objective is to maximize cumulative performance over all domains subject to resource constraints (e.g., limited memory, fixed model capacity) and to explicit safety or stability constraints (as in robotic control) (Josifovski et al., 13 Mar 2025, Wu et al., 15 Jan 2025, Yan et al., 19 Oct 2025, Cho et al., 2023).
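Under hypothetical notation (policy $\pi$, per-domain reward $r_t$ and safety cost $c_t$ with budget $d_t$, discount $\gamma$, and parameter budget $B$; none of these symbols are fixed by the cited papers), this constrained objective can be sketched as:

```latex
\max_{\pi}\; \sum_{t=1}^{T} \mathbb{E}_{\tau \sim \pi,\, \mathcal{D}_t}\!\Big[\sum_{k} \gamma^{k}\, r_t(s_k, a_k)\Big]
\quad \text{s.t.} \quad
\mathbb{E}_{\tau \sim \pi,\, \mathcal{D}_t}\!\Big[\sum_{k} \gamma^{k}\, c_t(s_k, a_k)\Big] \le d_t \;\; \forall t,
\qquad |\Theta| \le B .
```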
2. Architectural Paradigms and Key Algorithms
Domain continual learning frameworks employ varied architectural approaches tailored to the specific challenges of domain drift, catastrophic forgetting, and cross-domain generalization:
A. Safe Continual Domain Adaptation (Safe-CDA)
Safe-CDA (Josifovski et al., 13 Mar 2025) formalizes control as a constrained Markov decision process (CMDP), optimizing for expected cumulative reward while enforcing safety cost bounds. The pipeline:
- Pretraining in randomized simulation: Wide-ranging parameter randomization in simulators generates robust policies via Projected Constrained RPO (PCRPO).
- Deployment-time continual adaptation: Fisher information computed from simulation underpins an Elastic Weight Consolidation (EWC) regularization for online updates, preventing forgetting and hazardous policy drift by penalizing deviation from critical weights.
- Safety critic and cost filters: Real-world trajectories further update the value-network parameters, with PCRPO delineating ‘safe,’ ‘soft-violation,’ and ‘violation’ regimes for constrained policy updates.
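The EWC mechanism described above can be illustrated with a minimal numpy sketch (all values are hypothetical; `theta_star` and `fisher` stand in for the simulation-trained weights and their estimated Fisher information):

```python
import numpy as np

def ewc_grad(theta, theta_star, fisher, lam):
    """Gradient of the penalty (lam/2) * sum_i F_i * (theta_i - theta*_i)^2."""
    return lam * fisher * (theta - theta_star)

theta_star = np.array([1.0, -0.5, 2.0])  # weights after simulation pretraining
fisher     = np.array([5.0,  0.1, 2.0])  # high Fisher => weight is critical
theta      = theta_star.copy()
lam, lr    = 1.0, 0.1

# A constant task gradient that pushes every weight equally; the EWC
# term resists movement of the high-Fisher (critical) weights the most.
task_grad = np.array([1.0, 1.0, 1.0])
for _ in range(50):
    theta -= lr * (task_grad + ewc_grad(theta, theta_star, fisher, lam))

drift = np.abs(theta - theta_star)
print(drift)  # the highest-Fisher weight has drifted least
```

Weights deemed unimportant in simulation (low Fisher) remain free to adapt online, while critical weights stay anchored near their pretrained values.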
B. Dynamic Expansion and Expert Routing
The Multi-Source Dynamic Expansion Model (MSDEM) (Wu et al., 15 Jan 2025) introduces “expert” modules for each new domain/task, leveraging:
- Multiple frozen backbones: Diversity of domain priors across pretrained networks.
- Task-specific dynamic attention: DEAM selects relevant backbone features per task.
- Dynamic Graph Weight Router (DGWR): Learns directed usefulness between experts, facilitates positive transfer, and preserves prior capabilities via frozen graph structure.
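A minimal numpy sketch of the fusion-and-routing idea, assuming softmax attention over frozen backbone features and a learned routing row over prior experts (shapes and values are illustrative, not the MSDEM implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Features from three frozen pretrained backbones (dimension 4 here);
# the backbones themselves never receive gradients.
backbone_feats = rng.standard_normal((3, 4))

# Task-specific attention logits: a DEAM-like selector trained only
# for the current task.
attn_logits = np.array([2.0, 0.0, -2.0])
attn = softmax(attn_logits)         # backbone 0 dominates for this task
fused = attn @ backbone_feats       # attention-weighted feature fusion

# Directed routing row for the new expert: learned usefulness of each
# prior expert's output for the current task, frozen afterwards.
prior_expert_outs = rng.standard_normal((2, 4))
router_row = np.array([0.7, 0.1])
routed = fused + router_row @ prior_expert_outs
print(routed.shape)  # (4,)
```

Freezing both the backbones and the routing weights of earlier experts is what preserves prior capabilities while still letting new experts draw on them.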
C. Replay and Generative Modeling
Generative Continual Concept Learning (Rostami et al., 2019) employs a PDP–CLS (Parallel Distributed Processing–Complementary Learning Systems) architecture:
- Latent embedding space coupling: Autoencoder–classifier uncovers domain-invariant concepts.
- GMM-based generative replay: Pseudo-examples generated and replayed to consolidate concept formation and mitigate forgetting, even under few-shot domain shift.
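The replay mechanism can be sketched as a one-Gaussian-per-concept mixture over latent codes (a simplification of the paper's GMM; all shapes and values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical latent codes the autoencoder produced for two previously
# learned concepts (classes) in the shared embedding space.
latents = {0: rng.normal(-2.0, 0.5, size=(200, 8)),
           1: rng.normal(+2.0, 0.5, size=(200, 8))}

# Fit one diagonal Gaussian per concept (a one-component-per-class GMM).
gmm = {c: (z.mean(axis=0), z.std(axis=0)) for c, z in latents.items()}

def sample_pseudo_batch(n_per_class):
    """Draw labeled pseudo-latents from the fitted mixture; decoding
    them would yield replay examples for past concepts."""
    zs, ys = [], []
    for c, (mu, sigma) in gmm.items():
        zs.append(rng.normal(mu, sigma, size=(n_per_class, mu.size)))
        ys.append(np.full(n_per_class, c))
    return np.concatenate(zs), np.concatenate(ys)

z_replay, y_replay = sample_pseudo_batch(16)
print(z_replay.shape, y_replay.shape)  # (32, 8) (32,)
```

Because only the per-concept mixture parameters are stored, no raw examples from earlier domains need to be retained.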
D. Unsupervised Continual Domain Adaptation and Generalization
The CoDAG framework (Cho et al., 2023) merges:
- Dual network architecture: Separate domain adaptation (DA) and domain generalization (DG) branches with interleaved pseudo-labeling and ERM losses.
- Replay buffer with distillation: Balanced memory sampling across domains, pseudo-label refinement, and mutual knowledge transfer between branches.
E. Memory-Efficient Adapter and Residual Architectures
AdapterCL for dialogue systems (Madotto et al., 2020) and LoRA+Gating+Heads for ViT (Hedjazi et al., 11 Apr 2025) adopt parameter-efficient residual adapters or domain-specific LoRA modules:
- Freezing base weights: Isolation of domain-specific learning into lightweight, non-interfering adapters.
- Per-domain output heads/gating: Feature-level gating and head isolation curtail cross-domain contamination, supporting long-term knowledge retention.
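A minimal numpy sketch of the residual-adapter pattern, assuming a single frozen linear base layer and zero-initialized up-projections (domain names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, bottleneck = 16, 4

# Frozen base projection shared by every domain (stands in for the
# pretrained backbone; it never receives gradients).
W_base = rng.standard_normal((d, d)) / np.sqrt(d)

def make_adapter():
    """Residual bottleneck adapter: only these ~2*d*bottleneck
    parameters would be trained for each new domain."""
    return {"down": rng.standard_normal((d, bottleneck)) * 0.01,
            "up":   np.zeros((bottleneck, d))}  # zero-init => identity residual

adapters = {"hotel": make_adapter(), "taxi": make_adapter()}

def forward(x, domain):
    h = x @ W_base  # frozen computation, shared across domains
    a = adapters[domain]
    return h + np.maximum(h @ a["down"], 0.0) @ a["up"]  # residual adapter

x = rng.standard_normal(d)
# With zero-initialized up-projections every adapter starts as the
# identity residual, so all domains initially agree with the base model.
assert np.allclose(forward(x, "hotel"), x @ W_base)
```

Since each domain touches only its own adapter, training on a new domain cannot overwrite the parameters that earlier domains depend on.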
3. Core Methodological Components
A. Domain Randomization and Drift Modeling
Wide parameter randomization during pretraining creates a robust envelope for real-world adaptation, but Safe-CDA demonstrates that continual adaptation must still account for in-environment drift using safe RL protocols and EWC regularization (Josifovski et al., 13 Mar 2025).
B. Catastrophic Forgetting Avoidance
- Regularization: EWC or Fisher-based penalties anchor important weights (Josifovski et al., 13 Mar 2025, Wang et al., 2024).
- Replay memory: Balanced, centroid-focused rehearsal buffers and generative replay prevent feature space collapse (Cho et al., 2023, Lyu et al., 2024, Rostami et al., 2019, Toldo et al., 2022).
- Adapter Isolation: Freezing prior adapters or branches and using domain-aware routing blocks direct parameter overwriting (Hedjazi et al., 11 Apr 2025, Madotto et al., 2020, Tang et al., 2024).
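The balanced-rehearsal idea above can be sketched as a per-domain reservoir buffer whose sampler draws equally from every stored domain (a generic sketch, not any one cited paper's buffer):

```python
import random
from collections import defaultdict

class BalancedReplayBuffer:
    """Per-domain reservoir with balanced sampling: each stored domain
    contributes equally to a rehearsal batch."""
    def __init__(self, per_domain_capacity):
        self.cap = per_domain_capacity
        self.store = defaultdict(list)
        self.seen = defaultdict(int)

    def add(self, domain, example):
        self.seen[domain] += 1
        buf = self.store[domain]
        if len(buf) < self.cap:
            buf.append(example)
        else:  # reservoir sampling keeps a uniform subsample per domain
            j = random.randrange(self.seen[domain])
            if j < self.cap:
                buf[j] = example

    def sample(self, batch_size):
        domains = list(self.store)
        per = max(1, batch_size // len(domains))
        batch = []
        for d in domains:
            batch += random.sample(self.store[d], min(per, len(self.store[d])))
        return batch

random.seed(0)
buf = BalancedReplayBuffer(per_domain_capacity=50)
for i in range(1000):
    buf.add("day", ("day", i))
for i in range(100):
    buf.add("night", ("night", i))
batch = buf.sample(20)
counts = {d: sum(1 for e in batch if e[0] == d) for d in ("day", "night")}
print(counts)  # {'day': 10, 'night': 10} despite the 10:1 stream imbalance
```

Balancing the rehearsal batch, rather than sampling proportionally to the stream, is what prevents a dominant domain from crowding out minority-domain features.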
C. Cross-Domain Feature Disentanglement
Methods such as DoT (Yan et al., 19 Oct 2025) use transformer-layerwise decomposition to separate semantic from domain-specific information, allowing synthesis of pseudo-features for unseen domains and explicit classifier realignment through attentive transformation.
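This is not the DoT algorithm itself, but the decomposition idea can be sketched in numpy, with random orthogonal projections standing in for the learned layerwise semantic/domain split:

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_sem = 8, 5

# Hypothetical split of a transformer-layer feature into complementary
# semantic and domain-specific subspaces (random projections here).
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
P_sem = Q[:, :d_sem] @ Q[:, :d_sem].T
P_dom = np.eye(d) - P_sem

h = rng.standard_normal(d)              # feature from a seen domain
h_sem, h_dom = P_sem @ h, P_dom @ h     # decompose
assert np.allclose(h_sem + h_dom, h)    # components recompose exactly

# Pseudo-feature for an unseen domain: keep the semantic component,
# swap in another feature's domain-specific component.
h_other = rng.standard_normal(d)
pseudo = h_sem + P_dom @ h_other
assert np.allclose(P_sem @ pseudo, h_sem)  # semantics preserved
```

Such pseudo-features let the classifier be realigned against domain styles it has never observed.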
D. Dynamic Expert Selection and Expansion
Dynamic expert expansion (MSDEM) (Wu et al., 15 Jan 2025) and Target-specific Memory expansion (ETM) (Kim et al., 2020) allow specialized adaptation while retaining prior expertise. Graph routers (DGWR) orchestrate hierarchical transfer and selection among experts.
4. Representative Algorithms and Pseudocode
Below is a representative Safe-CDA continual learning loop in pseudocode:
```python
# Phase 1: pretraining in randomized simulation
for pretrain_iter in range(N_pre):
    # randomize domain parameters in simulation
    rollouts = collect_randomized_trajectories(...)
    update_policy_PCRPO(rollouts)
    update_critics(rollouts)
F_i = estimate_fisher_information(...)

# Phase 2: deployment-time continual adaptation
while deployed:
    episodes = collect_real_world_trajectories(...)
    update_critics(episodes)
    loss = PCRPO_loss + (lam / 2) * sum_i(F_i * (rho_i - rho_tilde_i) ** 2)
    gradient_step(policy_params, loss)
```
For MSDEM (expert expansion and dynamic routing):
```python
for task_t in range(T):
    instantiate_new_expert()
    build_deam_attention()
    build_dgwr_router_weights()
    for batch in task_t_data:
        forward_pass_through_backbones(batch)
        apply_deam_and_router()
        compute_cross_entropy_loss()
        update_new_expert_params()
```
5. Evaluation Benchmarks and Empirical Performance
- Robotics (Safe-CDA): Real-world adaptation shows reduced safety violations, faster adaptation, retention of general policies, and task-agnostic grasping success rates roughly doubled, to 60%, on the real robot (Josifovski et al., 13 Mar 2025).
- Visual Incremental Learning (MSDEM): Dual- and multi-domain streams demonstrate a 1–15% average accuracy improvement over prompting- and mixture-of-experts-based baselines, with the fastest convergence and lowest computational cost among the compared methods (Wu et al., 15 Jan 2025).
- Semantic Segmentation (LwS, ETM): Style transfer–distillation synergy achieves robust cross-domain class transfer and minimal forgetting; state-of-the-art mIoU on Cityscapes/IDD/Mapillary (Toldo et al., 2022, Kim et al., 2020).
- Dialogue Systems (AdapterCL, Replay): Adapters and replay buffer both provide strong performance, but multi-task training remains superior when all data is available simultaneously (Madotto et al., 2020).
- Incremental Learning with Domain-Aware Components: von Mises–Fisher mixture expansion and bi-level memory sampling deliver significant accuracy and forgetting gains on iCIFAR-20, iDomainNet, and iDigits (Xie et al., 2022).
6. Open Challenges and Future Directions
Key unresolved directions in domain continual learning include:
- Explicit domain shift detection and out-of-envelope adaptation (current methods often assume drift remains within initial simulation/randomization).
- Scalable memory and parameter management (dynamic expansion vs adapter isolation vs buffer replay vs generative replay).
- Efficient cross-domain generalization across unseen or rare domains (few-shot, zero-shot, or hard domain boundaries).
- Unified theoretical guarantees: While linear-feature continual learning admits efficient and no-forgetting solutions, deep nonlinear representations remain hindered by information-theoretic barriers—requiring replay, expansion, or improper learning strategies (Peng et al., 2022).
The field advances through the synthesis of safe RL, dynamic expansion, generative memory, feature disentanglement, memory replay, and parameter-efficient fine-tuning into integrated domain continual learning architectures, with empirical validation spanning robotics, vision, dialogue, recommendation, and medical imaging (Josifovski et al., 13 Mar 2025, Wu et al., 15 Jan 2025, Yan et al., 19 Oct 2025, Rostami et al., 2019, Toldo et al., 2022, Hedjazi et al., 11 Apr 2025, Cho et al., 2023, Kim et al., 2020, Lyu et al., 2024, Wang et al., 2024, Xie et al., 2022, Madotto et al., 2020, Tang et al., 2024, Tasai et al., 31 Oct 2025).