Cognitive Friction: A Decision-Theoretic Framework for Bounded Deliberation in Tool-Using Agents

Published 31 Mar 2026 in cs.AI | (2603.30031v3)

Abstract: Autonomous tool-using agents in networked environments must decide which information source to query and when to stop querying and act. Without principled bounds on information-acquisition costs, unconstrained agents exhibit systematic failure modes: excessive tool use under congestion, prolonged deliberation under time decay, and brittle behavior under ambiguous evidence. We propose the Triadic Cognitive Architecture (TCA), a decision-theoretic framework that formalizes these failure modes via cognitive friction. By combining nonlinear filtering, congestion-dependent cost dynamics, and HJB optimal stopping, TCA models deliberation as stochastic control over a joint belief-congestion state, explicitly pricing information by tool signal quality and live network load. TCA yields an HJB-inspired stopping boundary and a computable rollout-based approximation of belief-dependent value-of-information with a net-utility halting condition. We validate TCA in two controlled environments (EMDG and NSTG) designed to isolate stopping quality, action selection under congestion, and temporal urgency. TCA improves resource outcomes while reducing time-to-action without degrading accuracy, gaining 36 viability points in EMDG and 33 integrity points in NSTG over greedy baselines. Ablations show that selection and stopping must be optimized jointly, as stopping rules alone recover at most 4 viability points. Sensitivity sweeps over alpha, beta, and lambda_S yield stable accuracy and interpretable trade-offs, and a continuation-value sweep over eta values 0, 0.1, 0.3, and 0.5 finds eta equal to zero is optimal under high temporal urgency. Finally, we demonstrate an illustrative instantiation around a black-box LLM on a memorisation-free corpus, where the same stopping principle executes using empirically computable uncertainty and value-of-information proxies.


Authors (1)

Davide Di Gioia

Summary

The paper introduces the Triadic Cognitive Architecture that integrates spatial, temporal, and epistemic cost modeling for bounded autonomous decision-making.
It applies formal tools from HJB optimal stopping and nonlinear filtering to couple belief updates with network congestion, validated on synthetic testbeds.
Empirical results show gains of 33 to 36 points over greedy baselines, with far fewer deliberation steps (14.5 vs. 114.5 in EMDG), improving resource viability.

Cognitive Friction and the Triadic Cognitive Architecture for Bounded Deliberation

Problem Formulation and Motivation

Autonomous tool-using agents deployed in networked, dynamic environments face three intertwined problems: which evidence to acquire, in what order to act, and when to stop. Existing agentic frameworks (ReAct, Tree-of-Thoughts, and similar paradigms) typically assume that information acquisition is free, which produces well-known pathologies: congestion saturation from indiscriminate querying, unbounded deliberation from time-insensitive reasoning, and epistemic collapse when evidence is mutually inconsistent. These failure modes point to a gap: the absence of formal, state-dependent friction terms governing space, time, and epistemic resolution.

The paper "Cognitive Friction: A Decision-Theoretic Framework for Bounded Deliberation in Tool-Using Agents" (2603.30031) addresses this gap. It argues that robust autonomous behavior requires explicit pricing of deliberation, and it structures agentic decision-making as a continuous-time stochastic control problem over a joint belief-congestion state space. The Triadic Cognitive Architecture (TCA) is the mathematical and practical formalism that realizes this objective.

Formal Architecture and Analytical Framework

The TCA synthesizes key constructs from nonlinear filtering theory, congestion-sensitive cost modeling, and Hamilton-Jacobi-Bellman (HJB) optimal stopping:

Belief State and Congestion Coupling: An agent's state is represented as $(p_t, C_t)$, with $p_t$ encoding the agent's current probabilistic beliefs about the true hypothesis, and $C_t$ modeling accumulated network congestion following action-dependent increments $Q(u)$. This induces dynamic, state-dependent spatial (congestion-based) and temporal (latency-based) cost profiles.
Information Acquisition under Cost: Each tool-query is formalized as a stochastic observation process with defined signal profiles, filtered via Wonham dynamics. Non-uniform tool signal quality and latency are explicitly encoded, with information gain quantified as the expected entropy reduction in the agent's belief.
HJB-Derived Optimal Stopping: The value function is characterized by a variational inequality that incorporates both expected information gain and explicit cost terms for congestion and delay. This yields a principled stopping boundary: halt deliberation as soon as the marginal value of information (VOI) falls below the net cognitive friction.
Discrete-Time Realization via Rollouts: Practical instantiation leverages Monte Carlo rollouts to estimate one-step VOI and applies a myopic net-utility halting condition. Approximation errors are formally bounded using probabilistic concentration (Hoeffding's inequality) and temporal discretization analysis.
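
The discrete-time realization above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes a finite hypothesis set, a known per-tool observation model `signal[h][o]`, and a linear friction term weighted by `alpha` and `beta`. The function names, the dictionary layout for tools, and the linear cost form are all hypothetical, and the discrete Bayes update stands in for the continuous-time Wonham filter.

```python
import math
import random


def bayes_update(belief, likelihoods):
    """Posterior over hypotheses given each hypothesis's likelihood of the observation."""
    unnorm = [p * l for p, l in zip(belief, likelihoods)]
    z = sum(unnorm)
    return [x / z for x in unnorm]


def entropy(belief):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in belief if p > 0)


def rollout_voi(belief, signal, n_rollouts, rng):
    """Monte Carlo estimate of the expected one-step entropy reduction for one tool.

    signal[h][o] = P(observation o | hypothesis h); assumed known (illustrative).
    """
    h0 = entropy(belief)
    hyps = list(range(len(belief)))
    total = 0.0
    for _ in range(n_rollouts):
        h = rng.choices(hyps, weights=belief)[0]          # sample a hypothesis from the belief
        obs = list(range(len(signal[h])))
        o = rng.choices(obs, weights=signal[h])[0]        # sample the tool's observation
        post = bayes_update(belief, [signal[k][o] for k in hyps])
        total += h0 - entropy(post)
    return total / n_rollouts


def friction(congestion, latency, alpha, beta):
    """Assumed linear friction: a congestion price plus a latency price."""
    return alpha * congestion + beta * latency


def should_stop(belief, tools, congestion, alpha, beta, n_rollouts=2000, seed=0):
    """Myopic net-utility halting: stop when no tool's VOI exceeds its friction."""
    rng = random.Random(seed)
    best_net = max(
        rollout_voi(belief, t["signal"], n_rollouts, rng)
        - friction(congestion + t["Q"], t["latency"], alpha, beta)
        for t in tools
    )
    return best_net <= 0.0
```

With a single informative binary tool and a uniform prior, `should_stop` keeps querying while congestion is low and halts once accumulated congestion makes the friction term exceed the estimated information gain.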

This formulation establishes a direct mapping between physical resource utilization and epistemic deliberation, constituting a rigorous bridge between classical bounded rationality and modern agentic AI.
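
For orientation, optimal-stopping problems of this kind are commonly written in obstacle form. The display below is a generic textbook-style form under assumed notation, not the paper's exact equation: $V(p, C)$ is the value function, $g(p)$ is the payoff from acting immediately on belief $p$, $\mathcal{L}^{u}$ is the generator of the belief-congestion dynamics under tool $u$, and $c(C, u)$ is the friction rate, with time decay folded into $c$.

$$\max\Big\{\, g(p) - V(p, C),\;\; \max_{u}\big[\mathcal{L}^{u} V(p, C) - c(C, u)\big] \Big\} = 0$$

The agent stops where $V = g$ and continues where $V > g$. In the stopping region, no tool's expected improvement in value $\mathcal{L}^{u} V$ exceeds its friction $c(C, u)$, which is the continuous-time counterpart of the net-utility halting condition.

The concentration bound for the rollout estimate is the standard Hoeffding inequality. If each of $n$ independent rollouts returns a gain in $[a, b]$, then

$$\Pr\big(|\widehat{\mathrm{VOI}}_n - \mathrm{VOI}| \ge \varepsilon\big) \le 2\exp\!\left(-\frac{2 n \varepsilon^{2}}{(b - a)^{2}}\right),$$

so $n \ge (b - a)^{2} \ln(2/\delta) / (2\varepsilon^{2})$ rollouts suffice for error at most $\varepsilon$ with probability at least $1 - \delta$. When the gain is an entropy reduction over $K$ hypotheses, the samples lie in an interval of width $\log K$, so one can take $b - a = \log K$.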

Empirical Evaluation and Numerical Results

Synthetic Testbeds: EMDG and NSTG

Empirical validation occurs on two synthetic but structurally distinct environments:

Emergency Medical Diagnostic Grid (EMDG): The agent must diagnose one of several critical pathologies via tool queries (e.g., rapid blood tests, MRI), with both temporal decay (patient viability) and query-induced congestion.
Network Security Triage Grid (NSTG): The agent conducts network threat analysis using heterogeneous digital forensics tools, subject to network congestion and system integrity decay.
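
Both testbeds share the same skeleton: each query adds congestion and consumes time, and a resource (patient viability or system integrity) decays as time passes. A toy Python sketch of that skeleton follows. The class name, the parameters `decay_rate` and `congestion_slowdown`, and the specific update rule are assumptions made for illustration; the paper's actual environment dynamics are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class GridState:
    congestion: float = 0.0   # accumulated congestion C_t
    viability: float = 100.0  # remaining resource (viability or integrity)
    steps: int = 0


def step(state, q_increment, latency, decay_rate=0.5, congestion_slowdown=0.1):
    """Apply one tool query.

    Congestion grows by the tool's increment Q(u); the resource decays in
    proportion to the tool's latency, which is stretched by current congestion
    (an assumed coupling, chosen only to make the interaction visible).
    """
    effective_latency = latency * (1.0 + congestion_slowdown * state.congestion)
    return GridState(
        congestion=state.congestion + q_increment,
        viability=max(0.0, state.viability - decay_rate * effective_latency),
        steps=state.steps + 1,
    )
```

Under this toy rule, repeating the same query becomes more expensive each time, because earlier queries have raised congestion and therefore the effective latency of later ones.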

Key Results:

In EMDG, TCA improves resource viability by about 36 points (93.0 vs. 56.8) over unconstrained greedy baselines and cuts deliberation from 114.5 to 14.5 steps, at matching accuracy. NSTG shows a similar pattern, with a 33.1-point gain in system integrity and a more than 15x reduction in action time.
Ablation studies demonstrate that the achieved resource efficiency is not explained by stopping rules alone; cost-aware tool selection and explicit friction pricing are critical. Purposive stopping baselines with greedy tool selection close at most 4 points of the viability gap, while TCA maintains dominance across all tested configurations.
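One-step lookahead (myopic) stopping is more than an efficient heuristic: in high-urgency regimes such as EMDG it is empirically exact, in that higher-order rollouts yield no additional value. This is consistent with the continuation-value sweep reported in the abstract, where eta equal to zero is optimal under high temporal urgency.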
Robustness to action-space scaling is established; with increased tool diversity, TCA maintains consistent resource advantages and effectively selects higher-latency tools only when marginal information gain justifies their cost.
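
The contrast between greedy and cost-aware tool selection in the results above can be illustrated with a small Python sketch. The tool names echo the EMDG examples, but every number here is invented for illustration, and the linear net-utility form with weights `alpha` and `beta` is an assumption rather than the paper's cost model.

```python
def greedy_tool(tools):
    """Baseline: pick the tool with the highest expected information gain, ignoring cost."""
    return max(tools, key=lambda t: t["gain"])


def cost_aware_tool(tools, congestion, alpha, beta):
    """Pick the tool with the best gain net of friction.

    Returns None when no tool pays for itself, which signals "stop and act".
    """
    def net(t):
        return t["gain"] - alpha * (congestion + t["Q"]) - beta * t["latency"]

    best = max(tools, key=net)
    return best if net(best) > 0 else None


# Illustrative values only: expected gain (nats), congestion increment Q, latency.
tools = [
    {"name": "rapid_test", "gain": 0.20, "Q": 0.5, "latency": 1.0},
    {"name": "mri", "gain": 0.60, "Q": 2.0, "latency": 6.0},
]
```

In this toy setting the greedy rule always chooses the high-latency tool, whereas the cost-aware rule chooses it only when the latency price `beta` is low, switches to the cheaper tool as urgency rises, and returns `None` once congestion makes every query a net loss.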

LLM-Based Corpus Retrieval

An illustrative instantiation applies TCA to retrieval around a black-box LLM. On a corpus constructed to be memorisation-free, the TCA controller uses empirically computable uncertainty and VOI proxies to trigger costly retrieval selectively, and it reaches both ACT and DEFER terminal states, as the theory predicts. This behavior supports the practical computability of the TCA stopping criterion outside synthetic setups.
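
One way such a controller could be wired up is sketched below in Python. Everything in it is a hypothetical stand-in: the uncertainty proxy is the entropy of repeated sampled answers, the VOI proxy assumes a retrieval removes a fixed fraction `voi_scale` of that uncertainty, and DEFER is read as "decline to answer when uncertainty is high and further retrieval is not worth its cost". The paper's actual proxies and thresholds are not specified in this summary.

```python
import math
from collections import Counter


def answer_entropy(samples):
    """Uncertainty proxy: Shannon entropy (nats) of the empirical answer distribution."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def decide(samples, retrieval_cost, retrievals_left, act_threshold=0.3, voi_scale=0.5):
    """Return 'ACT', 'RETRIEVE', or 'DEFER' for the current set of sampled answers.

    act_threshold and voi_scale are illustrative knobs, not values from the paper.
    """
    h = answer_entropy(samples)
    if h <= act_threshold:
        return "ACT"                      # confident enough to answer now
    voi_proxy = voi_scale * h             # assumed: retrieval removes a fixed share of uncertainty
    if retrievals_left > 0 and voi_proxy > retrieval_cost:
        return "RETRIEVE"                 # expected gain exceeds the price of another query
    return "DEFER"                        # uncertain, and more retrieval does not pay
```

The same net-utility comparison drives all three outcomes: the controller acts when uncertainty is already low, retrieves while the proxy gain exceeds the retrieval cost, and defers otherwise.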

Theoretical and Practical Implications

Foundational Advances

TCA provides a formal framework for cognitive friction, the structural coupling of spatial, temporal, and epistemic costs in agentic deliberation. It generalizes prior work on bounded rationality and meta-reasoning by introducing state-dependent, environment-grounded cost modeling; classical bounded optimality is presented as a special case of this triadic approach. Unlike approaches with static or heuristic cost scalars, TCA enforces monotonic stopping regions and cost-coupled action selection, which prevents non-monotone or oscillatory deliberation trajectories.

Relevance to Agentic AI

By embedding computation as a cost in the agentic loop, TCA targets the standard failure modes of unconstrained LLM-based agents: overconsumption of networked resources, unbounded or redundant deliberation, and brittle behavior under ambiguous evidence. The framework extends to heterogeneous environments and action spaces and is reported to be robust to approximations of belief and cost proxies, which makes it directly relevant to building practical tool-using autonomous systems.

Limitations and Future Directions

Current limitations include reliance on synthetic environments with tractable belief dynamics, monotone congestion models, and a single-agent focus. Extending TCA to multi-agent settings with explicit game-theoretic coupling, to real-world data, and to implicit belief updates in neural architectures are direct avenues for future research. In addition, efficient amortization and learning-based surrogates for VOI estimation are required to scale to large tool ecosystems.

Conclusion

The "Cognitive Friction" framework and the Triadic Cognitive Architecture together offer a principled, mathematically grounded approach to bounded deliberation in tool-using agents. By recasting deliberation as a controlled trajectory through congested, temporally decaying, and epistemically uncertain environments, TCA provides actionable criteria for when to query, what to query, and when to act. Gains of 33 to 36 points in resource outcomes over greedy baselines are achieved without sacrificing accuracy, and the domain-agnostic structure suggests applicability to a broad class of agentic AI systems. The formalism contributes to the design of safe, scalable autonomy by internalizing the physical and epistemic costs of decision-making in complex environments.