
Session Risk Memory (SRM)

Updated 2 July 2026
  • Session Risk Memory (SRM) is a deterministic session-level mechanism that combines per-action spatial checks with a temporal risk accumulator.
  • It updates a compact internal state using semantic centroids and risk signals per turn to detect gradual attacks like slow-burn exfiltration and privilege escalation.
  • On an 80-session benchmark, adding SRM to the stateless ILION gate kept detection at 100% and cut the false positive rate from 5% to 0%, with sub-millisecond per-turn overhead.

Session Risk Memory (SRM) is a deterministic, trajectory-level authorization mechanism that extends stateless pre-execution safety gates to provide robust defenses against distributed, multi-step policy violations in agentic systems. While conventional authorization gates enforce spatial consistency—verifying whether individual actions align with assigned roles—SRM augments these systems with a temporal risk accumulator that captures evolving behavioral patterns across the entire agent session. This hybrid design enables the detection and mitigation of gradual, composite attacks that evade per-action thresholds, such as slow-burn data exfiltration, stepwise privilege escalation, and compliance drift. SRM achieves session-level safety with minimal computational overhead and without the need for additional model training, probabilistic inference, or fine-tuning (Chitan, 22 Mar 2026).

1. Motivation and Conceptual Framework

Stateless, per-action authorization gates such as ILION operate by evaluating each proposed agent action against semantic thresholds derived from its role, ensuring that overtly malicious or out-of-policy actions are blocked with deterministic, low-latency vetting. However, attackers can split harmful intent across a series of individually compliant steps—each benign in isolation—bypassing these stateless gates until the final harmful act is reached. SRM addresses this structural blind spot by introducing a session-level memory module that accumulates trajectory risk over time and flags sessions for intervention once the aggregate risk exceeds a predefined threshold. This introduces a conceptual distinction between spatial consistency (per-action) and temporal consistency (over the action sequence), furnishing a principled basis for “defense in depth” (Chitan, 22 Mar 2026).

2. Internal State, Equations, and Update Mechanism

SRM maintains a compact internal state, updated at each turn $t$ of the agent session according to deterministic, interpretable equations:

  • Semantic Centroid: $c_t \in \mathbb{R}^d$, a smoothed vector centroid summarizing recent behavior (for ILION, $d=21$), updated as $c_t = \alpha v_t + (1-\alpha)c_{t-1}$, where $v_t$ is the semantic embedding of the current action and $\alpha$ is the smoothing factor (empirically $\alpha=0.35$).
  • Session Baseline Risk: $b$, computed as an exponential moving average of the raw gate risk over the first $K$ turns ($K=3$) and frozen thereafter.
  • Risk Accumulation: an exponential average of the baseline-subtracted risk.
  • Gate Risk Signal: the raw per-turn risk from the gate, scoring deviations from safety across multiple semantic dimensions.
  • Session Flag: Once the accumulated risk exceeds the session threshold, the session is irreversibly flagged.

The drift signal is included for extensibility, though in the ILION 21-D embedding it is saturated and not currently weighted (Chitan, 22 Mar 2026).
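The state update described above can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: $\alpha=0.35$ and $K=3$ come from the text, while the smoothing factor `beta`, the `threshold` value, seeding the centroid with the first embedding, and clamping the baseline-subtracted risk at zero are all assumptions made for the example. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SessionRiskMemory:
    alpha: float = 0.35     # centroid smoothing factor (from the text)
    warmup_turns: int = 3   # K: baseline is frozen after this many turns (from the text)
    beta: float = 0.5       # ASSUMED smoothing factor for baseline and accumulator
    threshold: float = 0.3  # ASSUMED session-flag threshold
    centroid: Optional[List[float]] = None
    baseline: float = 0.0
    accumulated: float = 0.0
    turn: int = 0
    flagged: bool = False

    def update(self, v: List[float], raw_risk: float) -> bool:
        """Fold one turn (embedding v, raw gate risk) into the session state."""
        self.turn += 1

        # Centroid: c_t = alpha * v_t + (1 - alpha) * c_{t-1}.
        # Seeding with the first embedding is an assumption.
        if self.centroid is None:
            self.centroid = list(v)
        else:
            self.centroid = [
                self.alpha * x + (1 - self.alpha) * c
                for x, c in zip(v, self.centroid)
            ]

        # Baseline: EMA of raw risk over the first K turns, frozen afterwards.
        if self.turn == 1:
            self.baseline = raw_risk
        elif self.turn <= self.warmup_turns:
            self.baseline = self.beta * raw_risk + (1 - self.beta) * self.baseline

        # Baseline-subtracted risk; the clamp at zero is an assumption.
        excess = max(0.0, raw_risk - self.baseline)

        # Accumulator: EMA of the baseline-subtracted risk.
        self.accumulated = self.beta * excess + (1 - self.beta) * self.accumulated

        # The flag is sticky: once set, it is never cleared.
        if self.accumulated > self.threshold:
            self.flagged = True
        return self.flagged


memory = SessionRiskMemory()
embedding = [0.0] * 21  # placeholder 21-D embedding
risks = [0.1, 0.1, 0.1, 0.3, 0.5, 0.7, 0.1]
flags = [memory.update(embedding, r) for r in risks]
print(flags)
```

With these assumed parameters, the session stays unflagged through the warmup and the first two escalating turns, is flagged on the sixth turn, and remains flagged when the risk drops back on the seventh.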

3. Integration with Deterministic Gates and System Operation

SRM is integrated with stateless agent gates such as ILION via a deterministic, non-intrusive update sequence per agent turn:

  1. Embed the action to obtain its semantic embedding $v_t$.
  2. Run the ILION gate and collect its scores (CVL, IDC, IRS, SVRF).
  3. Block the action immediately on any stateless veto (spatial consistency).
  4. Compute the raw risk.
  5. Update or freeze the risk baseline.
  6. Compute the baseline-subtracted risk.
  7. Optionally compute the drift signal.
  8. Calculate the per-turn session risk.
  9. Update the accumulated risk and the session centroid $c_t$.
  10. Flag the session if the accumulated risk exceeds the threshold (temporal consistency).

SRM preserves the stateless gate’s immediate blocking of overtly unsafe actions while providing a temporal window to detect subtle escalation and distributed threats (Chitan, 22 Mar 2026).
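The ordering of the two checks can be illustrated with a short Python sketch. Everything here is hypothetical: `toy_gate` stands in for ILION, and `make_toy_memory` replaces SRM's exponentially averaged, baseline-subtracted accumulator with a plain running sum so the control flow stays visible. Skipping the memory update for vetoed actions is also an assumption, since the source does not say whether blocked turns contribute to session risk.

```python
from typing import Callable, List, NamedTuple


class GateResult(NamedTuple):
    veto: bool       # stateless gate blocks the action outright
    raw_risk: float  # scalar risk derived from the gate scores


def authorize_turn(
    action: str,
    gate: Callable[[str], GateResult],
    update_memory: Callable[[float], bool],
) -> str:
    """Return "BLOCK", "FLAG", or "ALLOW" for one agent turn."""
    result = gate(action)
    if result.veto:
        # Spatial check fails: block immediately. Skipping the memory
        # update for vetoed actions is an assumption of this sketch.
        return "BLOCK"
    # Temporal check: fold this turn's risk into the session memory.
    return "FLAG" if update_memory(result.raw_risk) else "ALLOW"


def make_toy_memory(threshold: float = 1.0) -> Callable[[float], bool]:
    """Hypothetical stand-in for SRM: a sticky flag over a running sum of risk."""
    total = 0.0
    flagged = False

    def update(raw_risk: float) -> bool:
        nonlocal total, flagged
        total += raw_risk
        flagged = flagged or total > threshold
        return flagged

    return update


def toy_gate(action: str) -> GateResult:
    """Hypothetical stand-in for ILION: veto one overt keyword, else a fixed risk."""
    if "drop_database" in action:
        return GateResult(veto=True, raw_risk=1.0)
    return GateResult(veto=False, raw_risk=0.4)


memory = make_toy_memory(threshold=1.0)
decisions: List[str] = [
    authorize_turn(a, toy_gate, memory)
    for a in ["read_file a", "read_file b", "drop_database", "read_file c"]
]
print(decisions)  # ['ALLOW', 'ALLOW', 'BLOCK', 'FLAG']
```

In this toy run the overt action is blocked by the gate alone, while the fourth action, individually acceptable, trips the session flag because the accumulated risk has crossed the threshold.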

4. Spatial versus Temporal Authorization Consistency

SRM formalizes the distinction between:

  • Spatial Consistency: Stateless gates ask, “Is this single action compatible with the agent’s role?”—effectively capturing overt violations and malicious payloads.
  • Temporal Consistency: SRM asks, “Is the evolving session trajectory as a whole consistent with the agent’s assigned role?”—thereby capturing “slow-burn” attacks, privilege creep, and policy circumvention patterns that unfold over multiple steps.

Spatial authorization alone is blind to distributed threats; temporal authorization in isolation risks excessive false positives. By combining both, the framework provides two-dimensional, principled defense with interpretable, real-time decisions (Chitan, 22 Mar 2026).

5. Empirical Performance and Computational Properties

Evaluation on a benchmark of 80 multi-turn agent sessions—spanning slow-burn exfiltration, gradual privilege escalation, and compliance drift—demonstrates:

| System | Detection Rate (Recall) | False Positive Rate (FPR) | F1 Score | Per-Turn Overhead |
|---|---|---|---|---|
| ILION | 100% | 5% | 0.9756 | < 1 ms |
| ILION + SRM | 100% | 0% | 1.0000 | ~240 μs |

On this benchmark, adding SRM eliminates all false positives while preserving sub-millisecond per-turn latency (dominated by embedding lookup and cosine calculation), meeting real-time system constraints. This gain is achieved without model retraining or probabilistic inference (Chitan, 22 Mar 2026).
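As a sanity check on the reported figures, the F1 scores can be recomputed from recall and FPR. The split of the 80 sessions into attack and benign sessions is not stated here, so the 40/40 split below is an assumption; it is used because it reproduces the reported F1 of 0.9756 for ILION alone.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


# ASSUMPTION: the 80 sessions split evenly into attack and benign sessions.
attack_sessions, benign_sessions = 40, 40

# ILION alone: 100% recall (no false negatives), 5% FPR on benign sessions.
ilion_fp = round(0.05 * benign_sessions)  # 2 benign sessions misflagged
ilion_f1 = f1_score(tp=attack_sessions, fp=ilion_fp, fn=0)

# ILION + SRM: 100% recall, 0% FPR.
srm_f1 = f1_score(tp=attack_sessions, fp=0, fn=0)

print(round(ilion_f1, 4), round(srm_f1, 4))  # 0.9756 1.0
```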

6. Limitations and Future Research Directions

Key limitations identified include:

  • Embedding Expressivity: The 21-D keyword vector saturates the drift metric; the drift signal is retained for future high-dimensional, continuous embeddings (e.g., transformer-based representations).
  • Structured Action Syntax: The benchmark uses structured sessions; real-world deployment will face more complex agent workflows, including branching, parallelism, and free-form text.
  • Warmup Trade-Offs: Baseline subtraction eliminates false positives but ignores first-turn escalations; stateless gates still catch overt first-turn attacks, but adaptive alternatives may be considered.
  • Extensibility: Proposed future work includes leveraging richer embeddings (which would allow the drift signal to be weighted), supporting session graphs, adopting adaptive or role-specific thresholds, and scaling up evaluation (Chitan, 22 Mar 2026).

A plausible implication is that SRM’s deterministic and interpretable approach makes it attractive as a front-end session-level safeguard, especially in enterprise environments where interpretability and low-latency risk management are critical.

7. Relation to Cross-Session and Retrieval-Augmented Session Risk

Session-level risk modeling also appears in retrieval-augmented LLM risk detectors, notably CS-VAR, which aggregates evidence across multiple sessions in domains such as live streaming. CS-VAR employs a graph-transformer backbone with cross-session memory and LLM-guided teacher distillation to surface recurring risk patterns and scam chains in real time (Qiao et al., 22 Jan 2026). However, CS-VAR's strategy is neither deterministic nor stateless: it combines learned sequence modeling, attention over patch grids, evidence retrieval via FAISS, and LLM-based knowledge transfer. Both SRM and CS-VAR foreground session-level trajectory modeling, but SRM emphasizes deterministic, interpretable, and deployable logic, while CS-VAR leverages LLM-augmented cross-session reasoning for risk detection in high-volume streaming environments (Chitan, 22 Mar 2026; Qiao et al., 22 Jan 2026).
