Domain-Incremental Learning

Updated 7 April 2026

Domain-Incremental Learning is a continual learning paradigm where a fixed label space is maintained while models sequentially adapt to shifting input distributions.
Practical methods including replay, regularization, and prompt-tuning balance new domain adaptation with the preservation of prior knowledge.
Benchmarking on diverse datasets and federated scenarios highlights trade-offs between memory use and performance, underlining challenges of catastrophic forgetting.

Domain-Incremental Learning (Domain-IL) is a continual learning paradigm in which a model encounters a sequence of domains (distinct input distributions sharing a fixed label space) over time and must incrementally adapt to each new domain without catastrophically forgetting earlier ones. In its canonical form, the learning protocol prohibits access to a domain's data once its learning phase has ended and requires the model to preserve competence on all domains seen so far, even as new domains are introduced. Unlike class-incremental learning, Domain-IL keeps the class set constant, so the difficulty lies in distributional shift and representation collapse rather than in a growing output space. These challenges, and the solution space they open, make Domain-IL a key research direction in continual deep learning, machine vision, federated learning, and adaptive multi-domain systems.

1. Formal Definitions and Problem Setting

Domain-IL assumes a sequence of datasets $\{D_1, D_2, \dots, D_T\}$, each associated with a distinct input distribution $P_t(x, y)$ but a shared label space $\mathcal{Y}$. At each timestep $t$, the model is trained only on $D_t$ and must, at evaluation time, perform well on all domains seen so far, $D_1, \dots, D_t$, without access to any prior training data $D_{1:(t-1)}$ (Huang et al., 2023, Zhou et al., 2024).

Mathematically, the canonical objective at stage $t$ is $\min_\theta\;\Big\{L_t(\theta)+\lambda\,R(\theta;\,\theta^{1:(t-1)})\Big\}$, where $L_t$ is the task loss on domain $D_t$, $\lambda$ weights the stability term, and $R$ constrains deviation from parameters important to prior domains (e.g., via replay, distillation, or regularization) (Churamani et al., 2021).

Evaluation employs metrics such as:

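Average accuracy: $A_T = \frac{1}{T}\sum_{i=1}^{T} a_{T,i}$ after learning all $T$ domains, where $a_{t,i}$ denotes the accuracy on domain $i$ after training through domain $t$.
Forgetting: commonly defined as $F_T = \frac{1}{T-1}\sum_{i=1}^{T-1} \max_{t \in \{i,\dots,T-1\}} \left(a_{t,i} - a_{T,i}\right)$, the mean drop from the best earlier accuracy on each domain to its final accuracy (Shi et al., 2023, Luo et al., 10 Mar 2025, Mulimani et al., 2024).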
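
As a minimal sketch of how these two metrics are typically computed, the snippet below assumes an accuracy matrix `acc` in which `acc[t][i]` is the accuracy on domain `i` after training through domain `t` (zero-indexed). The matrix values and function names are hypothetical, and the forgetting convention used here (best earlier accuracy minus final accuracy) is one common choice rather than the only one.

```python
from typing import List


def average_accuracy(acc: List[List[float]]) -> float:
    """Mean accuracy over all domains after the final stage (last row)."""
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def forgetting(acc: List[List[float]]) -> float:
    """Mean drop from the best earlier accuracy to the final accuracy.

    Only the first T-1 domains count; the last domain has no earlier stage.
    """
    num_domains = len(acc)
    if num_domains < 2:
        return 0.0
    drops = []
    for i in range(num_domains - 1):
        # Domain i is first learned at stage i, so earlier rows are ignored.
        best_earlier = max(acc[t][i] for t in range(i, num_domains - 1))
        drops.append(best_earlier - acc[-1][i])
    return sum(drops) / len(drops)


# Hypothetical accuracies for three domains (rows: stage, columns: domain).
acc = [
    [0.90, 0.00, 0.00],
    [0.80, 0.85, 0.00],
    [0.70, 0.80, 0.88],
]
a_T = average_accuracy(acc)  # mean of the last row
f_T = forgetting(acc)        # mean of the per-domain drops 0.20 and 0.05
```

Under this convention a negative forgetting value means that later training improved accuracy on earlier domains.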

Domain-IL differs from Task-IL (distinct label sets per task) and Class-IL (incrementally growing label sets) in two respects: its output space stays constant, and its difficulty comes exclusively from domain shift.

2. Algorithmic Methodologies and Paradigms

A diverse range of methodologies has been developed and empirically validated for Domain-IL:

Replay-based Methods. Episodic memory buffers store a small representative set of prior domain samples. Interleaving replayed and new-domain data during training is highly effective, often matching or nearly matching joint training even with tiny buffers (as low as 0.5% of the prior dataset) (Kalb et al., 2022). Replay outperforms EWC, LwF, and related regularization approaches in reported segmentation, vision, and audio experiments (Mulimani et al., 2024). Task-agnostic extensions use online clustering to maintain a representative fixed-size memory without domain labels (Lamers et al., 2023).
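
The interleaving step can be sketched as follows. This is an illustrative example only: it fills the buffer with reservoir sampling, which is one common way to keep a fixed-size, uniformly sampled memory and is not necessarily the policy used in the cited works. The class and function names are hypothetical.

```python
import random
from typing import Any, List


class ReservoirBuffer:
    """Fixed-size memory holding a uniform random subset of everything seen."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.items: List[Any] = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, item: Any) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
            return
        # Keep the new item with probability capacity / seen.
        j = self.rng.randrange(self.seen)
        if j < self.capacity:
            self.items[j] = item

    def sample(self, k: int) -> List[Any]:
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


def mixed_batch(new_batch: List[Any], buffer: ReservoirBuffer, n_replay: int) -> List[Any]:
    """Concatenate current-domain samples with replayed samples."""
    return list(new_batch) + buffer.sample(n_replay)


# Hypothetical usage: store domain-1 samples, then train on domain 2.
buffer = ReservoirBuffer(capacity=4, seed=0)
for i in range(100):
    buffer.add(("domain1", i))
batch = mixed_batch([("domain2", i) for i in range(8)], buffer, n_replay=4)
```

Each training step on the new domain then computes the loss on `batch`, so gradients reflect both the current and the stored domains.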

Regularization-based Methods. Penalty terms constrain parameter drift from previous optima, with variants including:

EWC: Fisher information-weighted quadratic penalty (Huang et al., 2023).
SI / MAS: Synaptic or output-sensitivity based importance tracking (Churamani et al., 2021).
IMM: Quadratic weight matching with post-hoc averaging (Huang et al., 2023).
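
For the EWC variant, the penalty has the standard form $\frac{\lambda}{2}\sum_i F_i(\theta_i - \theta_i^{*})^2$, with $F_i$ a diagonal Fisher estimate. The sketch below works on plain Python lists for clarity; the function names are hypothetical, and the Fisher estimate assumes the caller supplies per-sample gradients of the log-likelihood.

```python
from typing import List


def diagonal_fisher(per_sample_grads: List[List[float]]) -> List[float]:
    """Diagonal Fisher estimate: mean of squared per-sample gradients."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] ** 2 for g in per_sample_grads) / n for i in range(dim)]


def ewc_penalty(theta: List[float], theta_star: List[float],
                fisher: List[float], lam: float) -> float:
    """(lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2."""
    return 0.5 * lam * sum(
        f * (p - q) ** 2 for p, q, f in zip(theta, theta_star, fisher)
    )


def ewc_penalty_grad(theta: List[float], theta_star: List[float],
                     fisher: List[float], lam: float) -> List[float]:
    """Gradient of the penalty with respect to theta."""
    return [lam * f * (p - q) for p, q, f in zip(theta, theta_star, fisher)]


# Hypothetical two-parameter example: only the first parameter is important.
theta_star = [0.0, 2.5]   # parameters after the previous domain
fisher = [4.0, 0.0]       # importance weights
theta = [1.0, 3.0]        # current parameters
penalty = ewc_penalty(theta, theta_star, fisher, lam=2.0)
grad = ewc_penalty_grad(theta, theta_star, fisher, lam=2.0)
```

Parameters with zero Fisher weight are free to move, which is how the method trades plasticity against stability.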

Distillation and Knowledge Transfer. Logit-prediction alignment on new-domain data, using the old-domain model as a reference (LwF), yields significant gains in non-IID scenarios, though it can misguide adaptation if domain label distributions diverge (Huang et al., 2023, Kalb et al., 2022).
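
A minimal sketch of such a distillation objective is given below. It assumes a temperature-softened KL divergence between the frozen old model's predictions and the new model's predictions on the same new-domain input. The original LwF formulation uses a closely related softened cross-entropy, which differs from this KL term only by a quantity that does not depend on the new model. The function names, `alpha`, and `temperature` are illustrative assumptions.

```python
import math
from typing import List


def log_softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_norm for z in scaled]


def distillation_kl(new_logits: List[float], old_logits: List[float],
                    temperature: float = 2.0) -> float:
    """KL(p_old || p_new) on softened distributions."""
    log_p_old = log_softmax(old_logits, temperature)
    log_p_new = log_softmax(new_logits, temperature)
    return sum(math.exp(lo) * (lo - ln) for lo, ln in zip(log_p_old, log_p_new))


def cross_entropy(logits: List[float], label: int) -> float:
    """Standard cross-entropy against a hard label."""
    return -log_softmax(logits)[label]


def lwf_style_loss(new_logits: List[float], old_logits: List[float], label: int,
                   alpha: float = 1.0, temperature: float = 2.0) -> float:
    """Task loss on the new domain plus a weighted distillation term."""
    return cross_entropy(new_logits, label) + alpha * distillation_kl(
        new_logits, old_logits, temperature
    )
```

The distillation term is zero when the new model reproduces the old model's softened predictions and grows as the two diverge, which is also why it can pull the model in the wrong direction when the old model is poorly calibrated on the new domain.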

Prompt-based Approaches and Modular Adaptation. Prompt-tuning and adapter-based continual learning (e.g., L2P, DualPrompt, MoP-CLIP) localize adaptation to lightweight, domain-specific modules, freezing the pretrained backbone and decoupling domains via either learned prompts or injected adapters (Nicolas et al., 2023, Park et al., 2024). Dynamic prompt banks, residual adapters, and decoupled classifier heads are recurrent motifs.
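
To illustrate the prompt-bank idea, the sketch below selects prompts by matching a query feature (assumed to come from the frozen backbone) against one learned key per prompt. The selection rule, cosine similarity followed by top-k, is an assumption made for illustration in the spirit of key-query prompt pools. The names and vectors are hypothetical and do not reproduce any specific cited method.

```python
import math
from typing import List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_prompts(query: List[float], keys: List[List[float]], top_k: int) -> List[int]:
    """Return indices of the top_k keys most similar to the query."""
    order = sorted(range(len(keys)), key=lambda i: cosine(query, keys[i]), reverse=True)
    return order[:top_k]


# Hypothetical prompt bank with three keys in a 2-D feature space.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.9, 0.1]  # feature of the current input from the frozen backbone
selected = select_prompts(query, keys, top_k=2)  # indices of the chosen prompts
```

Only the selected prompts (and their keys) would receive gradient updates, leaving the backbone and the remaining prompts untouched.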

Consolidation Techniques. Bi-level switching between representation and classifier merging, as in Duct, allows explicit parameter-space tracking of each domain's adaptation vector and optimal transport based alignment of classifier heads across evolving embedding spaces (Zhou et al., 2024).

Other Strategies. Prototype-based rehearsal-free methods (DualCP), adversarial source-free adaptation (ALeN), and contrastive domain-proxy architectures are also highly competitive (Wang et al., 23 Mar 2025, Ambastha et al., 2023).

3. Empirical Advances and Benchmarking

Comprehensive benchmarking on datasets such as DomainNet, CORe50, Office-Home, Audioset, and large medical and NLP corpora has led to several robust empirical insights:

Method	Buffer Size	Average Accuracy A_T (%; DomainNet unless noted)	Forgetting F_T (%)	Notable Properties
Replay (M=32)	≈0.5%	56.0	-	Simple mixing, close to the offline upper bound
MoP-CLIP	0	69.7	-	Prompt-tuned CLIP, OOD robust
ICON	0	54.44	13.32	Adapter + classifier expansion
Duct	0	67.2 (IN1K)	lowest	Dual-level consolidation
DualCP	0	60.13	-1.96	Dual ETF prototypes, rehearsal-free
UDIL	<1%	82.1 (CORe50)	19.6	Adaptive loss bounding
RefFiL	0	Improves over baselines	-	Prompt-sharing FL, rehearsal-free
One-Shot BN Fix	0	95.9 (new domain)	Negligible	BN fix needed for one-shot regime
DARE	small	37.2	-19.4 (BWT)	Gradual drift control
pFedDIL	per-client	up to +14.35% (gain)	-	Correlation-guided FDIL

Replay is dominant when a memory buffer is permitted (Kalb et al., 2022); prompt-based and consolidation strategies outperform regularization or distillation when no buffer is allowed or when domains are highly non-IID (Nicolas et al., 2023, Zhou et al., 2024, Wang et al., 23 Mar 2025). Newer techniques targeting rehearsal-free scenarios leverage fixed concept subspaces or prompt ensembles to close the gap to buffer-based baselines. Fine-tuning alone results in rapid catastrophic forgetting, confirming the necessity of domain-specific bias correction or knowledge preservation (Mulimani et al., 2024).

4. Specialty Scenarios: Federated and Task-Agnostic Domain-IL

In federated settings, Domain-IL must contend with non-IID clients, privacy constraints, and personalized adaptation:

RefFiL: Achieves rehearsal-free Domain-IL in FL through prompt sharing and contrastive learning, without relying on any external memory (Sun et al., 2024).
pFedDIL: Each client adaptively selects and migrates knowledge from prior domain-specific models based on auxiliary classifier correlation, enabling per-user continual adaptation and robust ensemble inference (Li et al., 2024).
Incremental Transfer: Weight transfer protocols (SWT, CWT) and optimizer reloading facilitate practical federated domain-IL; cyclic protocols close the joint-training gap even under severe data heterogeneity (Huang et al., 2023).

Task-agnostic Domain-IL, tackling unobserved or undetectable domain boundaries, exploits clustering-based memory updates and online task-ID inference:

TA-A-GEM / TA-OGD: Cluster-based memory pools and projection steps support domain-incremental learning under total boundary uncertainty (Lamers et al., 2023).
TADIL: Lightweight embedding clustering and drift detection match oracle-level accuracy without requiring explicit task markers (Bravo-Rocca et al., 2023).
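
The general idea of inferring domain identity from embeddings, without boundary markers, can be sketched with a simple online nearest-centroid rule. This is not the algorithm of TA-A-GEM, TA-OGD, or TADIL. The distance threshold, the running-mean update, and all names below are assumptions chosen for brevity.

```python
import math
from typing import List


class DomainTracker:
    """Online nearest-centroid domain assignment with a novelty threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.centroids: List[List[float]] = []
        self.counts: List[int] = []

    def assign(self, embedding: List[float]) -> int:
        """Return a domain id, creating a new domain if nothing is close enough."""
        if self.centroids:
            dists = [math.dist(embedding, c) for c in self.centroids]
            k = min(range(len(dists)), key=dists.__getitem__)
            if dists[k] <= self.threshold:
                # Update the matched centroid as a running mean.
                self.counts[k] += 1
                n = self.counts[k]
                self.centroids[k] = [
                    c + (x - c) / n for c, x in zip(self.centroids[k], embedding)
                ]
                return k
        # No centroid within the threshold: treat as a newly detected domain.
        self.centroids.append(list(embedding))
        self.counts.append(1)
        return len(self.centroids) - 1


# Hypothetical stream of batch-level embeddings from two domains.
tracker = DomainTracker(threshold=1.0)
stream = [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [0.1, 0.1], [5.1, 4.9]]
domain_ids = [tracker.assign(e) for e in stream]
```

A change in the assigned id acts as a detected boundary, at which point a task-agnostic learner could switch memory pools or domain-specific modules.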

5. Mitigating Catastrophic Forgetting and Representation Drift

Preventing the erasure of past domain knowledge hinges on controlling the balance of plasticity (acquisition of new information) and stability (retention of prior representations):

Replay and Prototypes: Periodic exposure to buffered prior samples or sampled Gaussian prototypes from past domain representations prevents drift of class–domain clusters in latent space (Kalb et al., 2022, Ambastha et al., 2023).
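
As an illustration of prototype-based pseudo-replay, the sketch below stores a per-dimension mean and standard deviation of the features of one class in one domain and later samples pseudo-features from that Gaussian. The diagonal-covariance assumption and all names are illustrative choices, not details taken from the cited works.

```python
import math
import random
from typing import List, Tuple

Prototype = Tuple[List[float], List[float]]  # (mean, std) per feature dimension


def fit_prototype(features: List[List[float]]) -> Prototype:
    """Per-dimension mean and population standard deviation."""
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    std = [
        math.sqrt(sum((f[d] - mean[d]) ** 2 for f in features) / n)
        for d in range(dim)
    ]
    return mean, std


def sample_features(proto: Prototype, n: int, rng: random.Random) -> List[List[float]]:
    """Draw n pseudo-features from the diagonal Gaussian prototype."""
    mean, std = proto
    return [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(n)]


# Hypothetical features of one class from an earlier domain.
old_features = [[1.0, 2.0], [3.0, 4.0]]
proto = fit_prototype(old_features)
pseudo = sample_features(proto, n=5, rng=random.Random(0))
```

The sampled pseudo-features can be mixed into classifier training on later domains, so only two vectors per class and domain need to be stored instead of raw samples.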

Gradient and Representation Control: Explicit projection or recalibration of gradient steps (e.g., MiCo’s direction/magnitude modules (Luo et al., 10 Mar 2025), OGD (Lamers et al., 2023)) and buffer subsampling (e.g., Intermediary Reservoir Sampling in DARE (Jeeveswaran et al., 2024)) limit representation collapse at task boundaries.
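
The projection idea behind orthogonal gradient descent (OGD) can be sketched as follows: gradient directions from earlier domains are kept as an orthonormal basis, and each new gradient has its components along that basis removed before the update. This is a simplified illustration on plain lists with hypothetical names, and it leaves out how the stored directions are chosen.

```python
import math
from typing import List


def dot(u: List[float], v: List[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def project_out(grad: List[float], basis: List[List[float]]) -> List[float]:
    """Remove the components of grad that lie in the span of the basis."""
    out = list(grad)
    for b in basis:
        coeff = dot(out, b)
        out = [x - coeff * y for x, y in zip(out, b)]
    return out


def extend_basis(basis: List[List[float]], grad: List[float], tol: float = 1e-12) -> None:
    """Gram-Schmidt step: add the normalized residual of grad to the basis."""
    residual = project_out(grad, basis)
    norm = math.sqrt(dot(residual, residual))
    if norm > tol:
        basis.append([x / norm for x in residual])


# Hypothetical stored directions from earlier domains.
basis: List[List[float]] = []
extend_basis(basis, [1.0, 0.0, 0.0])
extend_basis(basis, [1.0, 1.0, 0.0])

new_grad = [2.0, 3.0, 4.0]
safe_grad = project_out(new_grad, basis)  # only the third component survives
```

Updating with `safe_grad` instead of `new_grad` leaves the model's behaviour along the stored directions unchanged to first order.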

Dual-level Alignment: Prototypical constraints at both coarse (superordinate) and fine (subordinate) semantic levels, as in DualCP, improve inter-class separation and preserve old-domain manifolds without rehearsal (Wang et al., 23 Mar 2025).

BatchNorm Specialization: For extreme low-shot adaptation, freezing pre-adaptation batch-normalization statistics arrests normalization drift—a primary culprit in catastrophic forgetting for one-shot DIL (Esaki et al., 2024).
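
The effect can be illustrated with a toy scalar normalization layer. In the sketch below, unfrozen mode normalizes with batch statistics and updates the running statistics, while frozen mode normalizes with the stored pre-adaptation statistics and leaves them untouched. This is a simplified illustration with hypothetical names, not the procedure of Esaki et al.

```python
import math
from typing import List


class RunningNorm1D:
    """Scalar-feature batch normalization with optionally frozen statistics."""

    def __init__(self, momentum: float = 0.1, eps: float = 1e-5) -> None:
        self.running_mean = 0.0
        self.running_var = 1.0
        self.momentum = momentum
        self.eps = eps
        self.frozen = False

    def forward(self, batch: List[float]) -> List[float]:
        if self.frozen:
            # Use pre-adaptation statistics and do not update them.
            mean, var = self.running_mean, self.running_var
        else:
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            self.running_mean += self.momentum * (mean - self.running_mean)
            self.running_var += self.momentum * (var - self.running_var)
        scale = math.sqrt(var + self.eps)
        return [(x - mean) / scale for x in batch]


one_shot_batch = [10.0]  # a single sample from a new domain

adaptive = RunningNorm1D()
out_adaptive = adaptive.forward(one_shot_batch)  # batch stats erase the input

frozen = RunningNorm1D()
frozen.frozen = True
out_frozen = frozen.forward(one_shot_batch)      # stored stats are preserved
```

With a single sample, the batch mean equals the input and the batch variance is zero, so the unfrozen layer both outputs zero and drags its running statistics toward the new domain; the frozen layer avoids both effects.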

Classifier and Representation Consolidation: Bi-level merging of parameter trajectories (Duct), augmented by optimal transport of classifier heads to align legacy and new-domain weights, facilitates unified cross-domain posteriors and eliminates classifier misalignment after embedding consolidation (Zhou et al., 2024).

6. Broader Applications and Theoretical Perspectives

Domain-IL has catalyzed progress beyond academic benchmarks:

Bias mitigation in demographic-sensitive recognition: Framing fair learning as a Domain-IL problem enables balanced accuracy/fairness trade-offs even under severe demographic skew (Churamani et al., 2021).
Medical and collaborative learning: Peer-to-peer and federated Domain-IL closes the generalization gap under privacy constraints and site-specific domain shifts (Huang et al., 2023).
Audio, video, and NLP: Dynamic per-domain parameterization (e.g., domain-specific BN and classifier heads for audio (Mulimani et al., 2024), replay for video action recognition (Hu et al., 2024), open-world prompt selection for NLP (Dai et al., 2023)) enables high plasticity and stability across modalities.
Open-world and OOD-robustness: Prompt ensembles and prototype mixing (MoP-CLIP), together with explicit unseen-domain handling, yield state-of-the-art OOD accuracy (Nicolas et al., 2023).

Recent theoretical analysis (UDIL) demonstrates that all major replay/distillation-based approaches correspond to fixed-coefficient versions of a unified upper bound; adaptively learning these weights achieves a strictly tighter risk bound in all memory regimes (Shi et al., 2023).

7. Open Problems and Outlook

Key challenges remain:

Memory constraints: Many high-performing methods require domain exemplars or active buffers; rehearsal-free strategies are an active area of development (Sun et al., 2024, Wang et al., 23 Mar 2025).
Task-agnostic, streaming scenarios: Identifying domain boundaries, adapting under delayed feedback, or supporting open-ended task sets continue to require scalable, lightweight protocols (Lamers et al., 2023, Bravo-Rocca et al., 2023).
Continual adaptation versus representation drift: Robustness to compounding domain shifts, sudden or gradual, remains fragile for dense or highly nonlinear input spaces (Jeeveswaran et al., 2024).
Integration with foundation and vision-language models: Parameter-efficient prompt-based and consolidation techniques are rapidly scaling to very large models, with ongoing work targeting end-to-end adaptation, subspace management, and continual pretraining for universal encoders (Zhou et al., 2024, Nicolas et al., 2023).

Domain-Incremental Learning has emerged as a critical testbed and application scenario for strong continual learning, with replay, prompt modularity, and consolidation methods now converging on practical solutions for real-world heterogeneous, distributed, and privacy-sensitive deployments.