Plasticity Ceiling Framework

Updated 19 December 2025

Plasticity-Ceiling Framework is a model that defines an agent’s adaptive capacity by quantifying learning limits using information theory and empirical metrics.
It operationalizes key concepts through NTK analysis, deep RL performance curves, and LLM post-training evaluations to monitor adaptation.
The framework unites continual learning, deep RL, and LLM methodologies to provide actionable guidelines for managing and extending adaptive performance.

The plasticity-ceiling framework establishes rigorous theoretical and empirical bounds on an agent’s or neural system’s capacity to learn, adapt, and control in both artificial and natural settings. “Plasticity” quantifies the system’s ability to update its internal state or policy in response to new data, while the “ceiling” denotes the intrinsic or emergent limit, defined by information-theoretic, architectural, or dynamical constraints, beyond which adaptive capacity collapses. This framework unites perspectives from continual learning theory, information-theoretic agency, deep RL, and LLM post-training, providing operational criteria, computable metrics, and actionable guidelines for preserving and maximizing adaptive performance.

1. Formal Definitions and Theoretical Foundations

At the core of the plasticity-ceiling concept lies a precise mathematical definition of plasticity and its coupling to system limits.

In information-theoretic agency, plasticity and empowerment are dual capacities, formalized via generalized directed information (GDI). For a discrete-time agent with action sequence $A_{1:n}$ and observations $S_{1:n}$ , the GDI from observations to actions over arbitrary intervals $[a:b]\to[c:d]$ is

$I_{\mathrm{GDI}}(S_{a:b}\rightarrow A_{c:d}) = \sum_{t=\max(a,c)}^{d} I\left(S_{a:\min(b,t)}; A_t | A_{1:t-1}, S_{1:t-1}\right),$

where $I(\cdot;\cdot|\cdot)$ denotes conditional mutual information. Plasticity is then defined as the maximum GDI from environment to agent, and empowerment as the dual (Abel et al., 15 May 2025).

Plasticity is tightly bounded: $E_{a:b\to c:d}(\pi,e) + P_{a:b\to c:d}(\pi,e) \leq m,$ where $m = \min\{(b-a)\log|O|, (d-c)\log|A|\}$ is a ceiling on total information transfer, determined by the cardinalities of observation ( $O$ ) and action ( $A$ ) spaces and chosen time windows.

In continual learning, the framework is realized concretely using the Neural Tangent Kernel (NTK). For a neural network $f_\theta$ , the empirical NTK on a dataset $\{x_i\}_{i=1}^n$ is

$K_{ij} = \nabla_\theta f_\theta(x_i)\cdot\nabla_\theta f_\theta(x_j),$

with eigendecomposition $\lambda_1\geq\cdots\geq\lambda_n>0$ (Shit, 3 Nov 2025). The effective learning directions are quantified via the “effective rank”: $\mathrm{ERank}(K) = \sum_{i=1}^n \mathbf{1}(\lambda_i > 0.01\lambda_{\max}),$ and the adaptive parameter fraction must satisfy

$p_{\min} = \max\Bigl(0.10,\, \frac{\mathrm{ERank}(K)}{n}\Bigr),$

with at least 10% of parameters remaining trainable—a hard plasticity ceiling. When the NTK condition number

$\kappa(K) = \frac{\lambda_{\max}}{\lambda_{\min}}$

reaches $10^{11}$ , learning collapses as the smallest eigen-directions vanish, operationalizing the system’s capacity limit.

In deep RL, the plasticity ceiling $\mathcal{C}$ is empirically the supremum non-zero rate of return or gradient norm over time: $\mathcal{C} = \sup_{t\in[0,T]}\left\{\mathcal{P}_R(t),\,\mathcal{P}_G(t)\right\},$ where $\mathcal{P}_R(t) = dR(t)/dt$ and $\mathcal{P}_G(t) = dG(t)/dt$ for return $R(t)$ and global gradient norm $G(t)$ (Yuan et al., 24 Apr 2025).

In LLM post-training, the plasticity-ceiling framework decomposes achievable post-training performance as

$P_{\mathrm{post}}(x_{\rm sft}, x_{\rm rl}) = P_{\rm sft}(x_{\rm sft}) + \frac{A_{\mathrm{post}}-P_{\rm sft}(x_{\rm sft})}{1+(x_{\rm rl}/C_{\rm mid,rl})^{-B_{\rm rl}}},$

with $P_{\rm sft}$ representing foundation established by supervised fine-tuning (SFT), $A_{\mathrm{post}}$ being the asymptotic performance ceiling, and $PL_{\rm rl} = A_{\mathrm{post}} - P_{\rm sft}$ quantifying remaining adaptation headroom for RL (Ding et al., 12 Dec 2025).

2. Operationalization and Metrics

Empirical measurement and intervention within the plasticity-ceiling framework require both rigorous metrics and protocolized procedures.

Plasticine (Yuan et al., 24 Apr 2025) provides a comprehensive metric suite for deep RL plasticity loss:

Ratio of Dormant Units (RDU)
Fraction of Active Units (FAU)
Stable Rank (SR) and Effective Rank (ER) of the feature matrix
Weight Magnitude (WM) and Difference (WD)
Feature Norm (FN) and Variance (FV)
Gradient Norm (GN)
Policy Entropy (PE)

Combining these, the system’s instantaneous capacity for further adaptation is monitored, with vanishing $\mathcal{P}_R$ and $\mathcal{P}_G$ marking arrival at the plasticity ceiling.

In continual learning, selection of paths to freeze is governed by Wilson confidence intervals, ensuring only statistically-validated, high-quality parameter paths reduce plasticity, with the lower bound $CI_{\mathrm{lower}}$ ensuring $CI_{\mathrm{lower}}\geq0.5$ for acceptance; empirically about 80% of discovered paths meet this threshold (Shit, 3 Nov 2025).

Path quality is measured via a weighted sum of five metrics: task performance improvement, stability, gradient importance, activation magnitude, and recency, ensuring only the most robust updates push against the plasticity ceiling.

In LLM post-training, the key empirical predictors for the ceiling include the minimum SFT validation loss (strong $r\approx-0.90$ correlation with attained $A_{\mathrm{post}}$ ), data scale (primary driver), and trajectory difficulty (multiplier effect but not a substitute for scale) (Ding et al., 12 Dec 2025).

3. Scaling Laws, Transitions, and Ceiling Effects

The dynamic approach to the plasticity ceiling is governed by scaling laws and threshold events.

In LLM post-training, final performance can be decomposed per the sigmoidal power law, and optimal transitions from SFT to RL maximize $P_{\rm sft}$ without collapsing $PL_{\rm rl}$ . Empirical rules are:

Transition to RL at or near the global minimum of SFT validation loss, specifically within a stable or mild overfitting window (tolerances of $+2\%$ or $+10\%$ over the minimum),
Avoid transitions during severe overfitting ( $\geq+10\%$ above minimum), which restrict $A_{\mathrm{post}}$ due to collapsed RL plasticity,
Prefer large, diverse SFT corpora, as data scale strictly determines maximum potential, with difficulty offering only a multiplicative effect (Ding et al., 12 Dec 2025).

In NTK-coordinated continual learning, once the condition number $\kappa(K)>10^{11}$ , only the top 1% eigendirections retain learning capacity, locking $p_{\min}$ at its lower bound and fully activating the plasticity ceiling (Shit, 3 Nov 2025). Capacity expansion (e.g., network growth) becomes necessary to regain plasticity.

In deep RL, architecture and regularization interventions (resets, normalization, weight/projective constraints) delay the onset of the ceiling but do not remove it. Only a combination of methods (e.g., Deep Fourier Features and TRAC optimizer) substantially extend adaptive horizons in open-ended environments (Yuan et al., 24 Apr 2025).

4. Practical Algorithms and Adaptive Interventions

The plasticity-ceiling framework prescribes explicit adaptive algorithms to preserve or extend adaptive capacity.

In NTK-guided continual learning (Shit, 3 Nov 2025):

At each task, recompute the NTK, effective rank, and plasticity bound $p_{\min}$ ;
Apply freeze-protection only to statistically-validated high-quality paths;
Enforce that at least 10% of parameters remain adaptive at all times;
Detect capacity exhaustion ( $\kappa>10^{11}$ ) and trigger expansion (e.g., adding network layers);
Use adaptive gradient masking, with $\alpha_j = \mathrm{clip}((1-p_{\min})\cdot\mathrm{Importance}_j, 0, 0.98)$ , to scale learning flexibility per-parameter.

Plasticine’s single-file method interface (Yuan et al., 24 Apr 2025) allows plug-and-play application of 13+ interventions, systematic logging, and benchmarking under varying levels of non-stationarity, supporting both continual and open-ended adaptation studies.

In LLM post-training (Ding et al., 12 Dec 2025), the decision flow is:

Gather a large SFT corpus (ideally $>$ 100K examples),
Monitor SFT validation loss, identify stable checkpoint $x_{\rm sft}^*$ ,
If minimum validation loss is above target, iterate data collection,
Switch to RL at $x_{\rm sft}^*$ , ensuring maximal RL plasticity is preserved.

5. Empirical Performance and Ceiling-Pushing Benchmarks

Representative findings highlight both the diagnostic and prescriptive power of the framework.

In continual learning (Split-CIFAR10), path-coordinated NTK-based methods yield $66.7\%$ average accuracy with $23.4\%$ forgetting, outperforming baselines. The system exhibits stabilization, with forgetting dropping from $27\%$ to $18\%$ as tasks progress, reflecting successful plasticity management (Shit, 3 Nov 2025).

Plasticine benchmarks show that in standard RL, normalization methods maintain gradient norm and delay plasticity loss; in continual RL, reset-based interventions maintain higher retention; in open-ended settings, only advanced methods prolong non-zero adaptation, with all methods eventually plateauing near the empirical plasticity ceiling (Yuan et al., 24 Apr 2025). See the following summary table (compiled from (Yuan et al., 24 Apr 2025)):

Scenario	Vanilla Return	Best Return	GN Ratio at End
Standard-Procgen	0.33 ± 0.05	0.72 (NaP)	0.62 ± 0.08
Continual-DMC	0.18 ± 0.02	0.54 (ReDo)	0.48 ± 0.06
Open-Craftax	0.05 ± 0.01	0.22 (Fourier+TRAC)	0.12 ± 0.03

In LLM post-training, sequential SFT $\to$ RL achieves highest performance ceilings (e.g., 78.1 vs. 71–73 for pure-RL or synchronized SFT-RL, and $r=-0.90$ between min SFT loss and $A_{\mathrm{post}}$ ), directly validating the framework’s guidelines (Ding et al., 12 Dec 2025).

6. Design Principles, Trade-offs, and Theoretical Implications

The plasticity-ceiling framework imposes quantitative design constraints and reveals fundamental trade-offs:

In agent-environment systems, maximizing plasticity or empowerment in one direction limits the other; the achievable sum is bounded by channel and time-window capacity (Abel et al., 15 May 2025).
Neural architectures must balance stability with maintained adaptation headroom; excessive freezing or over-regularization risks falling below the plasticity floor and irreversibly stalling learning (Shit, 3 Nov 2025).
In practical long-horizon RL and LLMs, interventions that delay or expand the ceiling (meta-learning when to reset, normalize, or grow the model) become central to scalable lifelong adaptation (Yuan et al., 24 Apr 2025).
Predictive monitoring (NTK condition number, validation loss minima) enables preemptive action before hitting the adaptive limit.

A plausible implication is that adaptive systems across AI domains—whether neural, symbolic, or embodied—converge to a common information-constrained trade-off surface, requiring principled balancing acts to optimize both learning sensitivity and control authority.

7. Extensions and Future Directions

Several research directions are immediate consequences of the plasticity-ceiling paradigm:

Development of meta-adaptive algorithms that learn to expand or shift the plasticity ceiling on the fly, for open-world and unsupervised continual learning (Yuan et al., 24 Apr 2025).
Deeper integration of NTK or information-theoretic diagnostics into training loops for early warning and automated intervention (Shit, 3 Nov 2025).
Application of the plasticity-ceiling formalism to broader classes of agents, including modular, hierarchical, or distributed systems (Abel et al., 15 May 2025).
Ongoing empirical work refining and expanding benchmarking scenarios, particularly in the context of persistent, open-ended novelty and robustness to adversarial non-stationarity (Yuan et al., 24 Apr 2025).
Investigation of plasticity-ceiling-aware curriculum design and data selection strategies, especially in large-scale LLM post-training (Ding et al., 12 Dec 2025).

The plasticity-ceiling framework thus offers a unifying quantitative lens for the design, evaluation, and extension of adaptive systems across machine learning and artificial intelligence.