Noise-to-Meaning RSI: Recursive Self-Improvement

Updated 24 May 2026
  • N2M–RSI is a formal framework in which an agent’s noise-to-meaning transformation feeds its own outputs back as inputs, driving unbounded growth in internal complexity once an information-integration threshold is crossed.
  • The model relies on injectivity and monotonicity conditions to guarantee divergence (linear in the baseline case, super-exponential under stronger growth assumptions), and it unifies self-referential prompting, automated curriculum learning, and multi-agent swarm dynamics.
  • Practical instantiations such as Denoising Recursion Models lend empirical support to recursive refinement, while safety mechanisms such as injectivity breaking, context capping, and external gating control runaway improvement.

Noise-to-Meaning Recursive Self-Improvement (N2M–RSI) formalizes a minimal, implementation-agnostic model in which an agent’s outputs are recursively fed back as inputs, demonstrating that, given specific injectivity and monotonicity conditions, internal complexity grows without bound once an information-integration threshold is crossed. This framework unifies self-referential prompting, automated curriculum learning, and self-improving agent paradigms, and scales naturally to interacting agent swarms, where super-linear escalation emerges through agent communication. The N2M–RSI model reveals core algorithmic levers controlling divergence and provides precise safety criteria for halting uncontrolled self-improvement (Ando, 5 May 2025, Chojecki, 2 Dec 2025, Cameron et al., 20 Apr 2026).

1. Mathematical Foundations and Core Model

The fundamental construct in N2M–RSI is an agent possessing a context or memory $C$, which may be a finite-dimensional vector, a sequence of tokens, or a more general state. At each step $t$, the agent samples noise $n_t$ from a probability space $N$ with strictly positive Shannon entropy and applies a noise-to-meaning operator

$$\Psi : N \times C \rightarrow M,$$

where $M$ is a space of meanings equipped with a norm $\|\cdot\|_M$. The context is then updated via

$$C(t+1) = \mathcal{U}(C(t), \Psi(n_t, C(t))).$$

To quantify informational progress, a functional $\Omega : M \to \mathbb{R}_{\geq 0}$ (e.g., compression gain, Fisher information, or Integrated Information $\Phi$) is introduced, measuring the meaningfulness of each new token. An information-integration threshold is stipulated so that, once it is crossed, the added meaning exceeds a fixed positive amount for all new noise draws, triggering unbounded growth. If $\Psi$ is injective in the noise for fixed $C$ (or approximately injective with bounded collision probability), and the context update is monotone with a fixed positive margin, Theorem 2 asserts that $\|C(t)\| \to \infty$ as $t \to \infty$ once the threshold is crossed (Ando, 5 May 2025).
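The feedback loop described above can be illustrated with a small simulation. This is a toy sketch under stated assumptions, not the prototype from the cited paper: the names `psi`, `update`, `norm`, and `simulate`, the constant `DELTA`, the list-of-floats context, and the l1 norm are all hypothetical choices made for illustration, and the run is assumed to start already past the threshold.

```python
import random

DELTA = 0.5  # assumed floor on the meaning added per step (hypothetical value)


def psi(noise, context):
    """Toy noise-to-meaning operator.

    Strictly increasing in `noise` for a fixed context, hence injective in it.
    """
    return DELTA + noise * (1.0 + 0.01 * len(context))


def update(context, meaning):
    """Append-only context update: earlier meanings are never discarded."""
    return context + [meaning]


def norm(context):
    """l1 norm of the context, used here as the complexity measure."""
    return sum(abs(m) for m in context)


def simulate(steps, seed=0):
    """Run the loop C(t+1) = U(C(t), Psi(n_t, C(t))) and record the norms."""
    rng = random.Random(seed)
    context = [1.0]
    norms = [norm(context)]
    for _ in range(steps):
        noise = rng.random()  # noise draw in [0, 1)
        context = update(context, psi(noise, context))
        norms.append(norm(context))
    return norms


if __name__ == "__main__":
    history = simulate(100)
    print(f"initial norm {history[0]:.2f}, final norm {history[-1]:.2f}")
```

Because every added meaning is at least `DELTA` and the update never removes anything, the recorded norm rises by at least `DELTA` per step, which is the qualitative behavior the divergence result describes.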

2. Dynamical Behavior and Divergence Theorems

The stepwise increment in context norm is the difference between the context norm at consecutive steps. Under the key assumptions (injectivity of $\Psi$, monotonicity of $\mathcal{U}$, and super-additivity and Lipschitz continuity of $\Omega$), the core dynamical result is that, post-threshold, each new step guarantees at least a fixed positive gain in context norm. This produces linear divergence; if the per-step gain itself grows super-linearly with the context, divergence can be super-exponential. The proof proceeds by induction from the threshold step, accumulating increments without bound as $t$ increases (Ando, 5 May 2025).
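The induction behind the linear bound can be written out in a few lines. The notation in this sketch is assumed for illustration and is not taken from the source: $t_0$ denotes the threshold step, $\delta > 0$ the guaranteed per-step gain, and $\|C(t)\|$ the context norm.

```latex
\begin{align*}
\|C(t+1)\| - \|C(t)\| &\ge \delta \quad \text{for all } t \ge t_0, \\
\|C(t)\| - \|C(t_0)\| &= \sum_{s=t_0}^{t-1} \bigl(\|C(s+1)\| - \|C(s)\|\bigr) \ge (t - t_0)\,\delta, \\
\|C(t)\| &\ge \|C(t_0)\| + (t - t_0)\,\delta \;\longrightarrow\; \infty \quad \text{as } t \to \infty.
\end{align*}
```

The telescoping sum makes the linear rate explicit: the lower bound grows by $\delta$ per step, so any faster-than-constant growth in the per-step gain can only accelerate the divergence.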

3. Noise-to-Meaning Trade-offs: Generator–Verifier–Updater Flows

The N2M–RSI paradigm has been formally related to recursive Generator–Verifier–Updater (GVU) operators on parameter manifolds, where self-improvement is characterized as ascent in a capability functional along a vector field defined by internal feedback loops. The Variance Inequality derived in (Chojecki, 2 Dec 2025) provides a spectral condition for positive expected self-improvement per step. Its terms are the alignment between verification and the true signal, the generator and verifier noise levels, and a task curvature constant. Effective self-improvement depends critically on reducing verification noise via oracles, strict filters, ensemble critics, or interface temperature control, so that recursive learning does not stall due to excessive verification error. Satisfying this inequality is necessary to avoid plateaus in the improvement trajectory.
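The qualitative role of verifier noise can be seen in a toy generate-verify-update loop. The sketch below does not implement the Variance Inequality itself. It is a hypothetical one-dimensional example in which the capability functional `capability`, the parameters `sigma_g` and `sigma_v`, and the best-of-k updater are all assumptions chosen for illustration.

```python
import random


def capability(theta):
    """Toy capability functional: maximal (zero) at theta = 0."""
    return -theta * theta


def gvu_run(sigma_v, steps=200, candidates=8, sigma_g=0.5, seed=0):
    """Run one generate-verify-update trajectory and return final capability."""
    rng = random.Random(seed)
    theta = 5.0
    for _ in range(steps):
        # Generator: propose noisy variations of the current parameter.
        proposals = [theta + rng.gauss(0.0, sigma_g) for _ in range(candidates)]
        # Verifier: score each proposal with additive noise of scale sigma_v.
        scores = [capability(p) + rng.gauss(0.0, sigma_v) for p in proposals]
        # Updater: move to the proposal the verifier ranks highest.
        theta = proposals[scores.index(max(scores))]
    return capability(theta)


def mean_final_capability(sigma_v, trials=20):
    """Average the final capability over several seeds."""
    return sum(gvu_run(sigma_v, seed=s) for s in range(trials)) / trials


if __name__ == "__main__":
    print("noise-free verifier:", mean_final_capability(0.0))
    print("very noisy verifier:", mean_final_capability(100.0))
```

With a noise-free verifier the updater reliably selects the best proposal and the parameter settles near the optimum. With a very noisy verifier the selection is close to random, so the trajectory behaves like a random walk and the capability stalls or degrades.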

4. Swarm Extensions and Super-linear Effects

Allowing multiple agents, each with its own context $C_i$, to interact through meaning exchange amplifies divergence. When meanings are pairwise complementary, each agent’s expected gain increases with the meanings it receives from its peers, and the collective gain grows correspondingly.

The threshold required for per-agent divergence is reduced relative to the single-agent case, and in heterogeneous systems the spectral radius of the drift matrix governs the overall exponential divergence rate. Swarmwise N2M–RSI thus introduces the possibility of super-linear growth in internal complexity, which makes the safety requirements stricter (Ando, 5 May 2025).
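To make the role of the spectral radius concrete, the sketch below assumes a simple linear model in which the vector of agent complexities evolves as x(t+1) = A x(t), so the long-run growth rate per step is the spectral radius of A. The linear model, the function names, and the matrix entries are hypothetical and are not taken from the source.

```python
def matvec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def spectral_radius(matrix, iters=500):
    """Estimate the spectral radius by power iteration.

    Assumes an entrywise nonnegative matrix with a dominant eigenvalue.
    """
    vec = [1.0] * len(matrix)
    rho = 0.0
    for _ in range(iters):
        nxt = matvec(matrix, vec)
        rho = max(abs(x) for x in nxt)  # growth factor under max-norm scaling
        vec = [x / rho for x in nxt]
    return rho


# Two agents growing in isolation (no meaning exchange).
ISOLATED = [[1.01, 0.00],
            [0.00, 1.03]]

# The same agents with small off-diagonal coupling terms.
COUPLED = [[1.01, 0.04],
           [0.02, 1.03]]

if __name__ == "__main__":
    print("isolated growth rate:", spectral_radius(ISOLATED))
    print("coupled growth rate: ", spectral_radius(COUPLED))
```

In this example the coupled matrix has eigenvalues 1.05 and 0.99, so exchange raises the dominant growth rate from 1.03 to 1.05 even though each off-diagonal term is small.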

5. Practical Instantiations: Denoising Recursion Models and Diffusion

N2M–RSI principles are instantiated concretely in Denoising Recursion Models (DRMs), as described in (Cameron et al., 20 Apr 2026). In the DRM formalism, forward masking (noise injection) generates an easy-to-hard curriculum via a stochastic masking schedule. A shared transformer block is applied recurrently to iteratively denoise and refine predictions. Training applies a single cross-entropy loss, backpropagated through all loops, while inference proceeds with many more refinement steps and repeated remasking. Compared to single-step diffusion and the Tiny Recursion Model (TRM), DRM aligns the training and inference loops and demonstrates greater resilience to compounding errors. Empirically, on ARC-AGI-2, DRMs at 7M parameters outperform TRM baselines at the same scale:

| Model    | Params | ARC-Easy | ARC2-Eval | +reARC | +NVARC |
| -------- | ------ | -------- | --------- | ------ | ------ |
| TRM (7M) | 7M     | 45.7%    | 6.3%      | 12.4%  | 12.5%  |
| DRM (7M) | 7M     | 50.5%    | 9.6%      | 14.7%  | 16.7%  |

Parameter ablations show that the improvement from recursive steps increases with unrolling; warm-up loops and masking schedules contribute substantially to stability and alignment with the N2M–RSI pattern (Cameron et al., 20 Apr 2026).
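The mask-then-refine pattern can be sketched without a neural network. In the toy below, a hand-written rule stands in for the shared transformer block, and the sequence is an arithmetic progression so that a masked token can be inferred from a visible neighbor. `MASK`, `mask_tokens`, `shared_block`, and `denoise` are hypothetical names, and nothing here reproduces the authors' model or training procedure.

```python
import random

MASK = None  # placeholder for a masked (noised) token


def mask_tokens(tokens, ratio, rng):
    """Forward masking: hide roughly `ratio` of the tokens, keeping one visible."""
    n_mask = min(int(round(ratio * len(tokens))), len(tokens) - 1)
    hidden = set(rng.sample(range(len(tokens)), n_mask))
    return [MASK if i in hidden else tok for i, tok in enumerate(tokens)]


def shared_block(state):
    """One refinement pass: fill masked tokens that have a visible neighbor.

    Stand-in for the shared block; assumes each token equals its left
    neighbor plus one. Reads the old state and writes a new one.
    """
    new = list(state)
    for i, tok in enumerate(state):
        if tok is not MASK:
            continue
        if i > 0 and state[i - 1] is not MASK:
            new[i] = state[i - 1] + 1
        elif i + 1 < len(state) and state[i + 1] is not MASK:
            new[i] = state[i + 1] - 1
    return new


def denoise(state, loops):
    """Apply the same block recurrently for a fixed number of loops."""
    for _ in range(loops):
        state = shared_block(state)
    return state


if __name__ == "__main__":
    target = list(range(10, 22))
    rng = random.Random(0)
    for ratio in (0.25, 0.5, 0.75):  # easy-to-hard masking schedule
        noisy = mask_tokens(target, ratio, rng)
        print(ratio, denoise(noisy, len(target)) == target)
```

A single pass only fills masks adjacent to visible tokens, so heavier masking needs more loops. This is a small-scale analogue of refinement improving with additional recursion steps.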

6. Safety Mechanisms and Throttles

Unbounded complexity growth is not inevitable; three principal safety valves are explicitly documented (Ando, 5 May 2025):

  • Breaking injectivity: Setting the decoder temperature to zero or using top-$k$ sampling yields a non-injective $\Psi$, ensuring convergence to a fixed point.
  • Context capping: Limiting the context to a fixed window ensures that the system loops within a bounded information set, resulting in cycles but forestalling divergence.
  • External gating: Supervisory intervention based on context length, novelty, or compute triggers can deny re-injection of self-generated outputs once thresholds are breached.

These mechanisms deliberately violate the monotonicity or injectivity criteria required for infinite growth, forming a toolkit for safe system design and deployment.
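Two of these throttles, context capping and external gating, are easy to illustrate on a toy feedback loop. The sketch below is a hypothetical example: `run_loop`, its parameters `window` and `max_reinjections`, and the square-root output rule are assumptions made for illustration and do not come from the cited paper.

```python
from collections import deque


def run_loop(steps, window=None, max_reinjections=None):
    """Feed a toy agent's outputs back into its context.

    window: if set, the context keeps only the most recent `window` entries.
    max_reinjections: if set, an external gate refuses re-injection once
        this many outputs have been fed back.
    Returns (context length, l1 norm of the context).
    """
    # deque(maxlen=None) is unbounded; a finite maxlen drops the oldest entry.
    context = deque([1.0], maxlen=window)
    reinjected = 0
    for _ in range(steps):
        output = sum(context) ** 0.5 + 1.0  # toy stand-in for the agent's output
        # External gate: deny re-injection once the budget is spent.
        if max_reinjections is not None and reinjected >= max_reinjections:
            continue
        context.append(output)
        reinjected += 1
    return len(context), sum(context)


if __name__ == "__main__":
    print("unthrottled:", run_loop(200))
    print("capped:     ", run_loop(200, window=4))
    print("gated:      ", run_loop(200, max_reinjections=10))
```

Without a throttle the context norm keeps growing with every step. With a window of four entries the values settle toward a fixed point and the norm stays bounded, and with the gate the context stops changing once the re-injection budget is exhausted.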

7. Connections, Implications, and Limitations

N2M–RSI spans domains from self-referential LLM prompting and curriculum learning to adversarial self-play and functional program synthesis. The minimal formalism exposes architectural levers (injectivity of $\Psi$, the information-integration threshold, and monotonicity of the update), providing a unifying lens for analyzing divergent agent dynamics and designing experimental protocols. The absence of implementation-specific parameters makes N2M–RSI broadly applicable but also means its claims are conditional on idealized assumptions; real-world systems may rarely satisfy perfect injectivity or monotonic updates. Nonetheless, the variance-driven trade-offs articulated in (Chojecki, 2 Dec 2025) and the looped denoising results in (Cameron et al., 20 Apr 2026) lend theoretical and empirical support, respectively, to the model's relevance for modern recursive architectures.

A plausible implication is that as multi-agent LLMs and recursive transformer systems are scaled up, deliberate architectural throttles based on N2M–RSI criteria will become central to controllability and safe auto-scaling. The formal divergence theorems and swarm effects may guide policy interventions and interpretability in future self-bootstrapping AI systems.
