- The paper demonstrates that local enforcement mechanisms cannot recover admission-time invariants, proving a structural compliance–invariance gap.
- The paper introduces the Invariant Measurement Layer (IML) that combines temporal, constraint, and lineage deviation measures to reliably detect hidden drift.
- The paper validates IML through theoretical proofs and empirical experiments, confirming its effectiveness in bridging observability gaps in multi-agent systems.
Introduction
"From Admission to Invariants: Measuring Deviation in Delegated Agent Systems" (2604.17517) rigorously investigates the foundational limits of enforcement-based governance in multi-agent systems and establishes the necessity of explicit deviation measurement mechanisms. Whereas classical enforcement mechanisms (e.g., permission checkers, policy engines, guardrails) operate on individual actions to flag local violations, this paper shows that such local mechanisms are structurally incapable of monitoring global behavioral drift from the admissible behavior space (A) established at agent admission time. The authors formalize and prove this impossibility (Non-Identifiability Theorem), then introduce the Invariant Measurement Layer (IML), which overcomes the observability limitations by anchoring monitoring to admission-time invariants. Comprehensive theoretical and empirical analysis demonstrates that only with admission-time snapshots as reference can behavioral drift be reliably detected, even when enforcement consistently registers zero violations.
Delegated agent systems partition behavior into an admission-time invariant set A—a function of initial constraints, context, and delegation lineage—and rely on an enforcement signal g that operates by flagging hard constraint violations at the action level. The Local Observability Assumption captures the structure of all practical enforcement: action-wise evaluation according to a statically defined rule set, independent of A. The contrast between the trajectory-level nature of A and the pointwise design of g is the crux of the observability gap. This framework generalizes beyond specific enforcement architectures and subsumes standard runtime verification, guardrail, and policy enforcement regimes.
Impossibility Results: The Compliance–Invariance Gap
Existence and Non-Identifiability
The central theoretical results consist of:
- Existence of Compliance-Invariance Gap: There always exist traces τ that are labeled as compliant by enforcement (g(τ)=0) but lie outside the admission-time behavior set (τ∉A), provided A is nontrivial (T1).
- Non-Identifiability Theorem: No function of g (including all conceivable risk scores and aggregators) can reconstruct A-membership; formally, A∉σ(g), where σ(g) is the minimal σ-algebra generated by the enforcement signal (T2).
This is supported by both existential and constructive witnesses based on actual agent runs. Empirically, scenarios are demonstrated where post-drift behavior diverges from the admission snapshot, while enforcement remains perpetually silent, confirming the theorem's applicability.
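The gap can be illustrated with a toy constructive witness. The following is a minimal sketch, not the paper's construction: the tool names, the forbidden-tool rule, and the total-variation membership test for A (with its tolerance) are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical rule set: enforcement only knows which individual tools are forbidden.
FORBIDDEN_TOOLS = {"delete_db"}


def enforce(trace):
    """Pointwise enforcement g: 1 if any single action violates a rule, else 0."""
    return int(any(action in FORBIDDEN_TOOLS for action in trace))


def tool_distribution(trace):
    """Empirical distribution of tools over a trace."""
    counts = Counter(trace)
    total = len(trace)
    return {tool: n / total for tool, n in counts.items()}


def in_admitted_set(trace, snapshot, tolerance=0.2):
    """Toy membership test for A (an assumption): the trace's tool distribution
    must stay within a total-variation tolerance of the admission snapshot."""
    dist = tool_distribution(trace)
    tools = set(dist) | set(snapshot)
    tv = 0.5 * sum(abs(dist.get(t, 0.0) - snapshot.get(t, 0.0)) for t in tools)
    return tv <= tolerance


# Hypothetical admission-time snapshot and two traces built from allowed tools only.
snapshot = {"search": 0.8, "send_email": 0.2}
admitted_trace = ["search"] * 8 + ["send_email"] * 2
drifted_trace = ["search"] * 2 + ["send_email"] * 8

for name, trace in [("admitted", admitted_trace), ("drifted", drifted_trace)]:
    print(name, "g =", enforce(trace), "in A =", in_admitted_set(trace, snapshot))
```

Both traces receive g = 0 because every individual action is permitted, yet only the first stays inside the toy admitted set. Any score computed from g alone therefore assigns the two traces the same value.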
Figure 1: Per-component deviation trajectories illustrating that only the lineage deviation component grows during delegation drift, while the temporal drift and constraint proximity components capture tool/context drift; the drift onset is marked in the figure.
Figure 2: The deviation remains undetected by the enforcement signal in all three drift scenarios, exemplifying the compliance-invariance gap.
Consequences
The information-theoretic corollary (the mutual information between A-membership and g is strictly less than the entropy of A-membership) highlights that even augmenting g with stochastic or continuous scoring offers no recourse: structurally, the local view cannot resolve invariant-level properties. The monotonic hidden drift theorem (T3) guarantees that a sequence of agent traces can diverge arbitrarily far from A while g(τ)=0 throughout.
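One way to see why post-processing the enforcement signal cannot help is the data processing inequality. The sketch below assumes the notation $\mathbf{1}_A$ for the A-membership indicator and $s$ for any score computed from g alone, possibly with independent randomness, so that $\mathbf{1}_A \to g \to s$ forms a Markov chain; neither symbol is taken from the paper.

```latex
% Data processing inequality along the chain 1_A -> g -> s,
% followed by the corollary's strict inequality.
I(\mathbf{1}_A ; s) \;\le\; I(\mathbf{1}_A ; g) \;<\; H(\mathbf{1}_A)
```

The first inequality is the standard data processing inequality; the second is the corollary as stated above. Together they say that no derived score carries more information about A-membership than g itself, which already falls short of what is needed to identify it.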
Invariant Measurement Layer: Structure and Guarantees
To bridge the fundamental observability shortcoming, the Invariant Measurement Layer (IML) is defined as a lightweight estimator operating over the agent's full behavior trajectory with direct access to the admission-time distributional snapshot.
IML Construction
The IML combines three deviation subcomponents:
- Temporal Drift: Jensen-Shannon (JS) divergence between the current tool-action distribution and the admission snapshot.
- Constraint Proximity: Mean risk score based on a pre-specified risk mapping.
- Lineage Deviation: Normalized deviation in mean delegation depth.
The composite score is maintained via an exponential moving average to provide low-latency yet stable deviation detection. Notably, the snapshot reference is frozen at admission and never adapts, unlike historical anomaly detection methods.
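The three components and the moving average described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class and field names, the equal weights, the smoothing factor `alpha`, the normalization by a maximum depth, and the example risk values are all assumptions.

```python
import math
from collections import Counter
from dataclasses import dataclass


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so it lies in [0, 1]) between two
    discrete distributions given as dicts mapping outcome -> probability."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a, b):
        return sum(a[k] * math.log2(a[k] / b[k]) for k in keys if a.get(k, 0.0) > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


@dataclass(frozen=True)
class AdmissionSnapshot:
    """Reference captured once at admission; it is never updated afterwards."""
    tool_dist: dict
    mean_depth: float
    max_depth: int  # assumed normalizer for lineage deviation


class IML:
    def __init__(self, snapshot, risk_map, weights=(1 / 3, 1 / 3, 1 / 3), alpha=0.1):
        self.snapshot = snapshot
        self.risk_map = risk_map  # hypothetical tool -> risk score in [0, 1]
        self.weights = weights    # assumed equal weighting of the components
        self.alpha = alpha        # assumed EMA smoothing factor
        self.score = 0.0

    def components(self, window):
        """window: list of (tool, delegation_depth) events."""
        tools = [tool for tool, _ in window]
        counts = Counter(tools)
        current = {t: n / len(tools) for t, n in counts.items()}
        temporal = js_divergence(current, self.snapshot.tool_dist)
        constraint = sum(self.risk_map.get(t, 0.0) for t in tools) / len(tools)
        mean_depth = sum(depth for _, depth in window) / len(window)
        lineage = min(1.0, abs(mean_depth - self.snapshot.mean_depth) / self.snapshot.max_depth)
        return temporal, constraint, lineage

    def update(self, window):
        """Fold the current window into the exponentially averaged composite score."""
        raw = sum(w * c for w, c in zip(self.weights, self.components(window)))
        self.score = (1 - self.alpha) * self.score + self.alpha * raw
        return self.score


snapshot = AdmissionSnapshot({"search": 0.8, "send_email": 0.2}, mean_depth=1.0, max_depth=4)
iml = IML(snapshot, risk_map={"search": 0.1, "send_email": 0.3})
baseline = [("search", 1)] * 8 + [("send_email", 1)] * 2
drifted = [("search", 3)] * 2 + [("send_email", 3)] * 8
print(iml.components(baseline))
print(iml.components(drifted))
```

Because `AdmissionSnapshot` is frozen and only read inside `components`, later behavior cannot shift the reference, which is the property the paragraph above contrasts with historical anomaly detection.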
Detection Guarantees
Theoretically, the IML is a consistent estimator for the natural instantiation of the true deviation function and guarantees finite expected detection delay under a bounded drift regime. For any chosen detection threshold, detection occurs within a window determined by the empirical drift rate, even as enforcement remains blind (g=0 throughout). The detection guarantee and explicit delay bounds are derived via large deviation inequalities over the sample trajectories.
Figure 3: Comparison of IML and anomaly detector; IML remains robust while the anomaly detector's score decays due to reference contamination after drift onset.
Figure 4: Detection delay is smallest for delegation drift, reflecting the dominance and rapid growth of the lineage deviation component; tool and context drift converge at higher thresholds.
Figure 5: Long-horizon drift shows that the deviation continues to grow monotonically in all drift types while the enforcement signal provides no warning.
Figure 6: In a LangGraph agent, the IML detects deviation and component breakdown is visible, whereas the enforcement signal is zero across the entire drift period.
Empirical Validation
Controlled simulation (300- and 1000-step horizons), a live n8n webhook workflow, and a deterministic agent framework (LangGraph) validate the theoretical claims:
- Drifted agent traces accrue monotonically increasing deviation in IML, while the enforcement signal stays at zero (g=0) throughout.
- IML detects all realistic forms of drift (tool usage rebalancing, delegation-depth increase, contextual shift) with finite, predictable delay.
- In contrast, an anomaly detector based on rolling historical windows loses sensitivity due to reference contamination, and is incapable of capturing lineage-level drift (B2 fails to register increases in delegation depth).
- Real workflow (n8n) and LangGraph stack experiments display prompt, consistent deviation detection, confirming IML's stack-agnostic utility.
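The reference-contamination effect noted in the list above can be reproduced on a toy signal. This is a minimal sketch under assumptions: the signal is a hypothetical mean delegation depth with a step change at step 50, the rolling window length is 20, and both scoring functions are illustrative stand-ins rather than the detectors used in the paper.

```python
from collections import deque


def frozen_reference_scores(signal, reference):
    """Deviation from a reference fixed at admission time."""
    return [abs(x - reference) for x in signal]


def rolling_reference_scores(signal, window):
    """Deviation from the mean of the most recent `window` observations,
    so the reference absorbs whatever the agent has been doing lately."""
    history = deque(maxlen=window)
    scores = []
    for x in signal:
        baseline = sum(history) / len(history) if history else x
        scores.append(abs(x - baseline))
        history.append(x)
    return scores


# Hypothetical signal: mean delegation depth jumps from 1.0 to 3.0 at step 50.
signal = [1.0] * 50 + [3.0] * 150
frozen = frozen_reference_scores(signal, reference=1.0)
rolling = rolling_reference_scores(signal, window=20)

print("at onset:", frozen[50], rolling[50])
print("at end:  ", frozen[-1], rolling[-1])
```

Both detectors react at drift onset, but once the rolling window fills with post-drift observations its score decays to zero, whereas the frozen reference keeps reporting the full deviation.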
Theoretical, Practical, and Architectural Implications
Architectural Minimality
The layered agent governance architecture emerging from this work is not a matter of engineering convenience but a formal necessity. Admission defines the invariants (A), enforcement addresses hard, local constraints, and IML covers the strictly necessary region between them: drift within the compliant set is structurally hidden from enforcement but detectable by referencing the admission-time invariants. This trichotomy is reinforced by composition results in related work, which indicate that omitting IML sacrifices invariant-level oversight.
Applications and Limitations
In practice, enforcement-only designs (even with adaptive risk scoring) will allow gradual behavioral drift in deployed agent systems, undermining policy guarantees. This poses particular risks in LLM-based agent architectures, where changes in behavioral context or delegation lineage are common and potentially insidious. The empirical findings underscore the need for explicit admission-time snapshots and invariant measurement as native first-class governance mechanisms. Key limitations involve the expressivity of the admission-time snapshot; richer behavioral contracts may require complex summaries, with associated computation and storage overhead. Adversarial drift, where agents actively minimize the measured deviation, is left for future work.
Conclusion
This paper establishes the formal impossibility of recovering admission-time behavioral invariants via enforcement-based governance alone and introduces IML as a provably necessary and empirically validated complement. Theoretical analysis, constructive witnesses, and real-agent experiments substantiate the structural separation between compliance and invariance. Effective drift monitoring in multi-agent systems therefore mandates direct reference to admission-time agent snapshots; IML is a scalable, compositional instantiation of this requirement. Systems relying solely on enforcement are, by design, incapable of detecting long-horizon behavioral drift. The results have immediate implications for the governance of deployed agent systems and set the foundation for further advances in compositional, multi-layered agent oversight.