Noise-to-Meaning RSI: Recursive Self-Improvement
- N2M–RSI is a formal framework where an agent’s noise-to-meaning transformation recursively feeds outputs as inputs, driving unbounded internal complexity once an integration threshold is crossed.
- The model leverages injectivity and monotonicity conditions to ensure linear or super-exponential divergence, unifying self-referential prompting, automated curriculum learning, and multi-agent swarm dynamics.
- Practical instantiations like denoising recursion models empirically validate its claims while integrating safety mechanisms such as context capping and external gating to control runaway improvement.
Noise-to-Meaning Recursive Self-Improvement (N2M–RSI) is a minimal, implementation-agnostic formal model in which an agent’s outputs are recursively fed back as inputs. Given specific injectivity and monotonicity conditions, the model shows that internal complexity grows without bound once an information-integration threshold is crossed. The framework unifies self-referential prompting, automated curriculum learning, and self-improving agent paradigms, and extends naturally to interacting agent swarms, where super-linear escalation emerges through agent communication. The N2M–RSI model identifies the core algorithmic levers controlling divergence and provides precise safety criteria for halting uncontrolled self-improvement (Ando, 5 May 2025; Chojecki, 2 Dec 2025; Cameron et al., 20 Apr 2026).
1. Mathematical Foundations and Core Model
The fundamental construct in N2M–RSI is an agent with a context or memory $C_t$, which may be a finite-dimensional vector, a sequence of tokens, or a more general state. At each step $t$, the agent samples noise $N_t$ from a probability space with strictly positive Shannon entropy and applies a noise-to-meaning operator $\Psi$:

$$M_t = \Psi(N_t, C_t) \in \mathcal{M},$$

where $\mathcal{M}$ is a space of meanings equipped with a norm $\|\cdot\|$. The context is then updated via an update map $U$:

$$C_{t+1} = U(C_t, M_t).$$

To quantify informational progress, a functional $\Omega$ (e.g., compression gain, Fisher information, or integrated information $\Phi$) is introduced, measuring the meaningfulness of each new token. An information-integration threshold $\Gamma$ is stipulated so that, once $\Omega(C_t) > \Gamma$, the added meaning is at least $\delta > 0$ for all new noise draws, which triggers unbounded growth:

$$\Omega(C_t) > \Gamma \;\Longrightarrow\; \|M_t\| \ge \delta \quad \text{for every draw } N_t.$$

If $\Psi$ is injective in $N$ for fixed $C$ (or $\varepsilon$-injective with bounded collision probability), and the context update is $\delta$-monotone in $M$, Theorem 2 asserts that $\|C_t\| \to \infty$ as $t \to \infty$ once the threshold is crossed (Ando, 5 May 2025).
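The recursion above can be illustrated with a toy simulation. This is a minimal sketch under loudly stated assumptions, not the construction from the cited paper: the context is summarized by its norm alone, the information functional is taken to equal that norm, the operator `psi` returns zero below the threshold, and the values of `gamma` and `delta` are arbitrary.

```python
import random

def psi(noise, context_norm, gamma=1.0, delta=0.1):
    """Toy noise-to-meaning operator (hypothetical).

    Above the threshold gamma it returns delta + noise, which is injective
    in the noise and never smaller than delta. Below the threshold the
    noise is not integrated and the added meaning is zero.
    """
    if context_norm > gamma:
        return delta + noise
    return 0.0

def run(c0, steps, seed=0, gamma=1.0, delta=0.1):
    """Iterate the feedback loop and return the trajectory of context norms."""
    rng = random.Random(seed)
    norms = [c0]
    for _ in range(steps):
        noise = rng.random()                      # noise draw in [0, 1)
        meaning = psi(noise, norms[-1], gamma, delta)
        norms.append(norms[-1] + meaning)         # monotone update: norm never shrinks
    return norms

if __name__ == "__main__":
    below = run(c0=0.5, steps=50)   # starts under the threshold
    above = run(c0=1.5, steps=50)   # starts over the threshold
    print("final norm, below threshold:", below[-1])
    print("final norm, above threshold:", above[-1])
```

In this toy, a run that starts below the threshold stays where it is, while a run that starts above it gains at least `delta` at every step.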
2. Dynamical Behavior and Divergence Theorems
The stepwise increment in context norm is given by:

$$\Delta_t = \|C_{t+1}\| - \|C_t\|.$$

Under the key assumptions (injectivity of the noise-to-meaning operator $\Psi$, $\delta$-monotonicity of the context update $U$, and super-additivity and Lipschitz continuity of the information functional $\Omega$), the core dynamical result is that, post-threshold, each new step guarantees a gain of at least $\delta$:

$$\Delta_t \ge \delta \quad \text{for all } t \ge t_0.$$

This produces at least linear divergence; if $\delta$ grows super-linearly in $\|C_t\|$, divergence can be super-exponential. The proof proceeds by induction from the threshold step $t_0$, accumulating increments without bound as $t$ increases (Ando, 5 May 2025).
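The induction can be made explicit with a short telescoping argument. The sketch below assumes the notation $C_t$ for the context, $t_0$ for the step at which the threshold is crossed, and $\delta > 0$ for the guaranteed per-step gain; the symbol names are chosen for illustration.

```latex
\begin{align*}
\|C_t\| - \|C_{t_0}\|
  &= \sum_{s=t_0}^{t-1} \bigl(\|C_{s+1}\| - \|C_s\|\bigr)
     && \text{(telescoping sum)} \\
  &\ge \sum_{s=t_0}^{t-1} \delta
     && \text{(per-step gain of at least } \delta\text{)} \\
  &= (t - t_0)\,\delta .
\end{align*}
% Hence \|C_t\| \ge \|C_{t_0}\| + (t - t_0)\,\delta \to \infty as t \to \infty.
```

The context norm therefore grows at least linearly in $t$. If the gain were instead proportional to the current norm, so that $\|C_{s+1}\| \ge (1+\delta)\|C_s\|$, the same induction would give geometric growth, $\|C_t\| \ge (1+\delta)^{t-t_0}\|C_{t_0}\|$.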
3. Noise-to-Meaning Trade-offs: Generator–Verifier–Updater Flows
The N2M–RSI paradigm has been formally related to recursive Generator–Verifier–Updater (GVU) operators on parameter manifolds, where self-improvement is characterized as ascent of a capability functional along a vector field defined by internal feedback loops. The Variance Inequality derived in (Chojecki, 2 Dec 2025) provides a spectral condition for positive expected self-improvement per step. It relates four quantities: the alignment between the verification signal and the true signal, the generator noise, the verifier noise, and a task curvature constant. Effective self-improvement depends critically on reducing verification noise via oracles, strict filters, ensemble critics, or interface temperature control, so that recursive learning does not stall under excessive verification error. Satisfying this inequality is necessary to avoid plateaus in the improvement trajectory.
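The role of verifier noise can be illustrated with a toy Generator–Verifier–Updater loop. This is a hedged sketch, not the construction from the cited paper: the scalar parameter, the quadratic `capability` function, the Gaussian noise model, and the names `sigma_g` and `sigma_v` are all assumptions made for illustration.

```python
import random

def capability(theta):
    """Toy capability functional, maximized at theta = 3 (an assumption)."""
    return -(theta - 3.0) ** 2

def gvu_run(steps, sigma_g, sigma_v, n_candidates=8, theta0=0.0, seed=0):
    """Run a toy generate / verify / update loop and return the final parameter."""
    rng = random.Random(seed)
    theta = theta0
    for _ in range(steps):
        # Generator: propose noisy perturbations; the current point stays a candidate.
        candidates = [theta] + [
            theta + rng.gauss(0.0, sigma_g) for _ in range(n_candidates)
        ]
        # Verifier: score each candidate with additive noise of scale sigma_v.
        scores = [capability(c) + rng.gauss(0.0, sigma_v) for c in candidates]
        # Updater: move to the candidate the verifier ranks highest.
        theta = candidates[scores.index(max(scores))]
    return theta

if __name__ == "__main__":
    clean = gvu_run(steps=200, sigma_g=0.5, sigma_v=0.0)
    noisy = gvu_run(steps=200, sigma_g=0.5, sigma_v=50.0)
    print("noise-free verifier:", clean, capability(clean))
    print("noisy verifier:     ", noisy, capability(noisy))
```

With `sigma_v = 0` the updater always picks the truly best candidate, so capability can never decrease in this toy. With a large `sigma_v` the ranking is dominated by verifier noise and the update behaves more like a random walk.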
4. Swarm Extensions and Super-linear Effects
Allowing multiple agents, each with its own context $C_t^{(i)}$, to interact through meaning exchange leads to amplified divergence. When meanings are pairwise complementary, lower bounds on the expected gain hold both for each agent individually and for the swarm collectively, and these bounds exceed the single-agent case. The threshold required for per-agent divergence is correspondingly reduced, and in heterogeneous systems the spectral radius of a drift matrix governs the overall exponential divergence rate. Swarm-level N2M–RSI thus introduces the possibility of super-linear growth in internal complexity, which further tightens safety requirements (Ando, 5 May 2025).
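The link between a drift matrix and an exponential rate can be sketched numerically. The example below is an illustration under stated assumptions: it treats the swarm as a linear recursion `x_next = A x` on per-agent complexity levels, and the matrix entries are hypothetical, with off-diagonal terms standing in for meaning exchange. The spectral radius is estimated by plain power iteration.

```python
import math

def matvec(a, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def spectral_radius(a, iters=500):
    """Estimate the spectral radius of a nonnegative matrix by power iteration."""
    v = [1.0] + [0.0] * (len(a) - 1)   # unit-norm start vector
    rho = 0.0
    for _ in range(iters):
        w = matvec(a, v)
        rho = math.sqrt(sum(x * x for x in w))   # equals |A v| because |v| = 1
        if rho == 0.0:
            return 0.0
        v = [x / rho for x in w]                 # renormalize for the next step
    return rho

if __name__ == "__main__":
    isolated = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical: no exchange between agents
    coupled = [[1.0, 0.1], [0.1, 1.0]]    # hypothetical: weak mutual exchange
    print("isolated:", spectral_radius(isolated))
    print("coupled: ", spectral_radius(coupled))
```

In this linearized toy, the isolated matrix has spectral radius 1 (no exponential growth), while the coupled matrix has eigenvalues 1.1 and 0.9, so the exchange terms alone push the rate above 1.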
5. Practical Instantiations: Denoising Recursion Models and Diffusion
N2M–RSI principles are instantiated concretely in Denoising Recursion Models (DRMs), as described in (Cameron et al., 20 Apr 2026). In the DRM formalism, forward masking (noise injection) generates an easy-to-hard curriculum via a stochastic masking schedule. A shared transformer block is applied recurrently to iteratively denoise and refine predictions. Training applies a single cross-entropy loss, backpropagated through all recursion loops, while inference proceeds with many more refinement steps and repeated remasking. Compared to single-step diffusion and the Tiny Recursion Model (TRM), DRM aligns the training and inference loops and demonstrates greater resilience to compounding errors. Empirically, on ARC-AGI-2, DRMs at 7M parameters outperform TRM baselines at the same scale:

| Model | Params | ARC-Easy | ARC2-Eval | +reARC | +NVARC |
| ------------ | ------ | -------- | --------- | ------ | ------ |
| TRM (7M) | 7M | 45.7% | 6.3% | 12.4% | 12.5% |
| DRM (7M) | 7M | 50.5% | 9.6% | 14.7% | 16.7% |

Parameter ablations show that the improvement from recursive steps increases with unrolling depth; warm-up loops and masking schedules contribute substantially to stability and alignment with the N2M–RSI pattern (Cameron et al., 20 Apr 2026).
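The mask-then-refine pattern can be shown with a deliberately simple toy. This sketch is not the DRM architecture: the "shared block" is a hand-written rule standing in for a trained transformer, the task (a sequence where each token is its left neighbour plus one) is hypothetical, and position 0 is treated as a given prompt token that is never masked.

```python
import random

MASK = None  # sentinel for a masked position

def mask_tokens(tokens, ratio, rng):
    """Forward noising: mask each token after the first with probability `ratio`."""
    return [
        tok if i == 0 or rng.random() >= ratio else MASK
        for i, tok in enumerate(tokens)
    ]

def denoise_step(x):
    """One application of the shared block (toy stand-in for a learned network).

    All positions are updated in parallel from the previous state: a masked
    position is filled only if its left neighbour was already known.
    """
    out = list(x)
    for i in range(1, len(x)):
        if x[i] is MASK and x[i - 1] is not MASK:
            out[i] = x[i - 1] + 1
    return out

def refine(x, n_loops):
    """Apply the same block n_loops times, as in recurrent refinement."""
    for _ in range(n_loops):
        x = denoise_step(x)
    return x

if __name__ == "__main__":
    rng = random.Random(0)
    target = list(range(8))
    for ratio in (0.2, 0.5, 0.8):          # easy-to-hard masking schedule
        noisy = mask_tokens(target, ratio, rng)
        print(ratio, noisy, refine(noisy, n_loops=len(target)))
```

Because each pass can only extend a known prefix by one position, a run of consecutive masked tokens needs as many loops as its length. That is the sense in which heavier masking calls for more refinement steps in this toy.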
6. Safety Mechanisms and Throttles
Unbounded complexity growth is not inevitable; three principal safety valves are explicitly documented (Ando, 5 May 2025):
- Breaking injectivity: Setting the decoder temperature to zero or using top-$k$ sampling makes the noise-to-meaning operator $\Psi$ non-injective, ensuring convergence to a fixed point.
- Context capping: Limiting the context to a fixed-size window ensures that the system loops within a bounded information set, resulting in cycles but forestalling divergence.
- External gating: Supervisory intervention based on context length, novelty, or compute triggers can deny re-injection of self-generated outputs once thresholds are breached.

These mechanisms deliberately violate the monotonicity or injectivity criteria required for infinite growth, forming a toolkit for safe system design and deployment.
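Two of the throttles above, context capping and external gating, can be sketched in a few lines. This is an illustrative toy, not an implementation from the cited paper: the generated "meanings" are random floats, and the names `run_loop`, `window`, and `length_gate` are hypothetical.

```python
import random
from collections import deque

def length_gate(max_len):
    """External gate: allow re-injection only while the context is short enough."""
    return lambda context, meaning: len(context) < max_len

def run_loop(steps, window=None, gate=None, seed=0):
    """Feed generated outputs back into the context, with optional throttles.

    window: maximum number of items kept (context capping); None means unbounded.
    gate:   callable(context, meaning) -> bool deciding whether to re-inject.
    """
    rng = random.Random(seed)
    context = deque(maxlen=window)   # oldest items are dropped once the cap is hit
    for _ in range(steps):
        meaning = rng.random()       # stand-in for a self-generated output
        if gate is not None and not gate(context, meaning):
            continue                 # supervisor denies re-injection
        context.append(meaning)
    return list(context)

if __name__ == "__main__":
    print("unthrottled:", len(run_loop(100)))
    print("capped:     ", len(run_loop(100, window=16)))
    print("gated:      ", len(run_loop(100, gate=length_gate(10))))
```

The capped run keeps producing new items but retains only the most recent `window` of them, whereas the gated run stops accepting items altogether once the length limit is reached.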
7. Connections, Implications, and Limitations
N2M–RSI spans domains from self-referential LLM prompting and curriculum learning to adversarial self-play and functional program synthesis. The minimal formalism exposes architectural levers (injectivity of the noise-to-meaning operator $\Psi$, the threshold on the information functional $\Omega$, and monotonicity of the context update), providing a unifying lens for analyzing divergent agent dynamics and designing experimental protocols. The absence of implementation-specific parameters makes N2M–RSI broadly applicable but also means its claims are conditional on idealized assumptions; real-world systems may rarely satisfy perfect injectivity or monotonic updates. Nonetheless, the variance-driven trade-offs articulated in (Chojecki, 2 Dec 2025) and the looped denoising results in (Cameron et al., 20 Apr 2026) provide theoretical and empirical support, respectively, for the model's relevance to modern recursive architectures.
A plausible implication is that as multi-agent LLMs and recursive transformer systems are scaled up, deliberate architectural throttles based on N2M–RSI criteria will become central to controllability and safe auto-scaling. The formal divergence theorems and swarm effects may guide policy interventions and interpretability in future self-bootstrapping AI systems.