Omission Constraints Decay While Commission Constraints Persist in Long-Context LLM Agents

Published 22 Apr 2026 in cs.CR and cs.AI | (2604.20911v1)

Abstract: LLM agents deployed in production operate under operator-defined behavioral policies (system-prompt instructions such as prohibitions on credential disclosure, data exfiltration, and unauthorized output) that safety evaluations assume hold throughout a conversation. Prohibition-type constraints decay under context pressure while requirement-type constraints persist; we term this asymmetry Security-Recall Divergence (SRD). In a 4,416-trial three-arm causal study across 12 models and 8 providers at six conversation depths, omission compliance falls from 73% at turn 5 to 33% at turn 16 while commission compliance holds at 100% (Mistral Large 3, $p < 10^{-33}$). In the two models with token-matched padding controls, schema semantic content accounts for 62-100% of the dilution effect. Re-injecting constraints before the per-model Safe Turn Depth (STD) restores compliance without retraining. Production security policies consist of prohibitions such as never revealing credentials, never executing untrusted code, and never forwarding user data. Commission-type audit signals remain healthy while omission constraints have already failed, leaving the failure invisible to standard monitoring.

Abstract PDF Upgrade to Chat

Authors (1)

Yeran Gamage

Summary

The paper demonstrates that omission constraints degrade significantly with longer contexts while commission constraints remain robust.
It employs a controlled experimental design with varied semantic and token injections across 12 models, revealing strong quantitative differences.
The study introduces the Safe Turn Depth metric and advises regular re-injection of omission constraints to mitigate potential security risks.

Security-Recall Divergence in Long-Context LLM Agents: Omission Constraints Decay, Commission Constraints Persist

Introduction

The study titled "Omission Constraints Decay While Commission Constraints Persist in Long-Context LLM Agents" (2604.20911) provides an empirical and causal analysis of how LLM agents degrade in following in-context behavioral policies across extended interactions. These policies commonly dictate crucial operational and security boundaries, such as prohibitions against disclosing sensitive information or requirements to include certain audit elements in outputs. The core finding is an asymmetry: omission constraints (prohibitions) decay significantly with increasing context length (either via conversation depth or semantically-meaningful token injection), while commission constraints (explicit requirements) remain robust. This divergence is designated as Security-Recall Divergence (SRD).

Experimental Methodology

The investigation covers 4,416 trials across 12 models from 8 providers, subjected to six context depths ( $t \in \{5, 10, 13, 16, 20, 25\}$ ) in a controlled, synthetic DevOps debugging scenario. The evaluation leverages a three-arm, between-subjects causal protocol:

Arm A (No Schema Dilution): Isolates depth by maintaining only core tool schemas.
Arm B (Schema Dilution): Injects 20 cloud tool schemas for context expansion, simulating CEI (Context-Exhaustion Injection) without malicious payloads.
Arm C (Token-Matched Padding Control): Injects 20 semantically neutral schemas, controlling for raw token volume versus semantic dilution.

All models receive the same system prompt and consistent behavioral rules—three commission constraints (e.g., always include an incident ID, always start with STATUS:, end with [AUDIT-OK]) and five omission constraints (e.g., never use bullet points, never use first person, never use code blocks). Compliance is measured using deterministic string and regex matching, avoiding classifier subjectivity, with per-constraint reliability quantified at every context depth.

Figure 1: The three-arm trial protocol separates dilution effects (semantic vs. token volume) while controlling all other script aspects.

Key Findings

Omission vs. Commission Constraint Decay

SRD manifests robustly under increasing context: omission constraint compliance drops markedly while commission constraint adherence remains stable.

Figure 2: Omission constraint C3 (no bullet points) decays with depth in Mistral Large 3, Nemotron 120B, and Qwen 3.5, but not Gemma 4 31B. Commission constraint C4 (STATUS:) holds at or near 100% across models at all depths.

In substantive quantitative terms, in Mistral Large 3, omission compliance for the 'no bullet points' rule (C3) decays from 73% at turn 5 to 33% at turn 16 and 20% at turn 25, while commission constraints (C8 - incident ID) remain at 100% (CMH $\chi^2 = 147$ , $p < 10^{-33}$ ). Qwen 3.5 sees even steeper omission decay, with C3 at floor by turn 5 in schema-dilution arms. In contrast, Gemma 4 31B displays immunity, maintaining 100% compliance, with at most two omission violations in 185 trials.

Figure 3: Commission constraints persist at $\geq$ 93% across turns, while omission compliance on susceptible models drops sharply with context depth.

Commission constraint survival is attributed to recency and self-reinforcement (required tokens consistently appearing in model output and thus in the context window). In contrast, omission constraints lack positive reinforcement and are subject to attention dilution—earlier context tokens containing policy constraints fall out of effective attention range as more tokens accumulate.

Heatmap and Asymmetry Visualization

SRD is visualized through a compliance heatmap. Red cells (low compliance) dominate omission constraints at increasing turn depths, corroborating the progressive vulnerability of these constraints. Commission constraint adherence remains high (visualized as a light band across rows), demonstrating that normal operational audits would not detect omission constraint failure even as it becomes pervasive.

Figure 4: The SRD heatmap reveals stable commission compliance and progressive omission constraint violation across depths and models.

Causal Isolation of Semantic vs. Token Volume Effects

Arm C controls establish that semantic dilution, rather than raw token volume, is the principal driver for omission constraint decay. In both Gemini 2.5 Flash and Llama 3.3 70B, token-matched neutral probes (no cloud semantics) do not induce the omission decay observed with semantic cloud schemas. Token volume explains only 38% (Gemini) and 0% (Llama) of the overall effect.

Figure 5: Semantically-neutral token padding in Arm C does not trigger the same omission decay as schema-rich Arm B.

Difficulty Deconfounding

A $2\times 2$ design shows hard commission constraints (C8: include incident ID) persist where hard omission constraints (C3: no bullets) fail, ruling out constraint difficulty as an explanatory factor for the observed divergence.

Safe Turn Depth (STD) Metric

The study introduces a deployable metric—Safe Turn Depth (STD)—marking the number of turns after which omission constraints become unreliable for each model. For instance, Mistral Large 3 exhibits STD of 10.6 turns (CI: [5.0, 16.7]).

Implications for Security and Deployment

The results have critical implications for LLM agent security in practical deployments:

Invisible Failures: As omission constraint compliance fails while commission audits remain healthy, standard auditing pipelines can falsely signal robust compliance until a policy breach (e.g., accidental data leak) materializes.
Attack Surface Extension: The Zone of Exploitation enables adversaries to trigger suppressed behaviors (e.g., output exfiltration) by extending context through legitimate mechanisms (e.g., registering benign tool schemas), without prompt injection or adversarial instructions.
Mitigation Strategies: The work prescribes two model-agnostic mitigations—periodic re-injection of omission constraints within the effective context window (before STD), and capping session length at the STD threshold or corresponding Safe Token Budget (STB) in tokens.

Limitations and Future Directions

Proxy Nature of Constraints: The study uses formatting constraints as operational analogues for security policies (e.g., "never use bullet points" as a stand-in for "never disclose API keys"). Future work must replicate the analysis with semantically meaningful (security-critical) constraints.
Synthetic Context: All scenarios are synthetic; real-world interaction patterns or tool outputs may modify decay rates.
Diagnosis of Immunity: Gemma 4 31B's immunity mechanism—whether architectural or deployment-specific—remains uncharacterized and warrants targeted investigation.
Auto-reinforcement Conflation: The refusal to generate certain outputs may auto-propagate through context, potentially inflating apparent decay. Memory-wipe or synthetic-history ablations would further disambiguate causal effects.

Conclusion

This study demonstrates that under long-context and contextually-diluted settings, omission (prohibitive) constraints decay in LLM agents while commission (additive) constraints persist, yielding Security-Recall Divergence. This effect is significant across multiple models, with strong statistical support, and occurs under benign operational usage as well as adversarially-extended contexts. Standard one-turn safety assessments are invalidated for long-context deployments. The practical fix—constraint re-injection before STD—offers immediate deployable risk mitigation without retraining requirements. These findings stress that secure long-context LLM deployment necessitates ongoing behavioral constraint reinforcement and context-aware policy management.

Markdown Report Issue