- The paper presents a declared reflective runtime protocol that decomposes agent functions for precise measurement of LLM contributions.
- Explicit world-model planning dramatically boosts win rates, illustrating the value of externalized reasoning in agent performance.
- Sparse LLM revision yields only marginal F1 improvements while trading off win rate, underscoring the benefit of symbolic control mechanisms.
Empirical Decomposition of LLM, Planning, and Reflection in Self-Revising Agents
Introduction
This work investigates how much of the competence of LLM-based agents is attributable to the LLM itself versus the explicit symbolic and reflective scaffolding wrapped around it. Recent practice in LLM agents conflates world modeling, planning, and reflection in a monolithic prompt loop, obscuring the provenance and impact of each component. The paper takes an empirical approach, instantiating a protocol that partitions agent mechanisms into inspectable runtime constructs. This “declared reflective runtime protocol” isolates posterior belief tracking, world-model planning, symbolic reflection, and sparse LLM revision into discrete layers. By evaluating these decomposed variants on noisy Collaborative Battleship, the authors measure each layer’s marginal contribution to agent performance.
Declared Runtime Protocol and Agent Decomposition
The core protocol externalizes key agent mechanisms as explicit, declarative structures, tracked and manipulated outside the LLM, with four principal elements:
- State: explicitly tracked world and agent state.
- Signals: deterministic confidence and revision eligibility computations.
- Guarded Actions: revision and policy patching only when precisely stipulated runtime criteria are met.
- Hypothetical Transitions: forward-simulated state transitions for planning.
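The four elements above can be pictured as ordinary declared data structures and pure functions. The following is a minimal Python sketch under stated assumptions: every class, field, and function name here is illustrative, not the paper's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class AgentState:
    """State layer: explicitly tracked world and agent state (illustrative fields)."""
    board_posterior: Dict[str, float] = field(default_factory=dict)  # cell -> hit probability
    shots_remaining: int = 0
    history: List[str] = field(default_factory=list)

def confidence_signal(state: AgentState) -> float:
    """Signals layer: a deterministic confidence computation (here, max posterior mass)."""
    return max(state.board_posterior.values(), default=0.0)

@dataclass
class GuardedAction:
    """Guarded Actions layer: a revision fires only when its runtime guard is satisfied."""
    guard: Callable[[AgentState], bool]
    apply: Callable[[AgentState], AgentState]

    def maybe_apply(self, state: AgentState) -> AgentState:
        return self.apply(state) if self.guard(state) else state

def hypothetical_transition(state: AgentState, cell: str, hit: bool) -> AgentState:
    """Hypothetical Transitions layer: forward-simulate a shot without mutating real state."""
    next_posterior = dict(state.board_posterior)
    next_posterior[cell] = 1.0 if hit else 0.0
    return AgentState(board_posterior=next_posterior,
                      shots_remaining=state.shots_remaining - 1,
                      history=state.history + [f"{cell}:{'hit' if hit else 'miss'}"])
```

Because every layer is a plain value or pure function rather than prompt text, each one can be ablated, logged, or unit-tested independently, which is what enables the measurements reported below.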
Agent instantiations form a sequence of ablated variants:
- greedy+MCMC: Posterior-based policy with no planning or revision.
- WMA: Adds explicit world-model planning and question selection.
- MRA: Adds symbolic in-episode self-revision, realized without invoking any LLM.
- MRA-LLM: Permits LLM intervention as a conditional, runtime-gated revision mechanism.
The main architectural distinction is that metacognitive and reflective operations are not latent behaviors emergent from LLM prompting, but declared, inspectable control flows amenable to precise ablation and measurement.
Empirical Findings
World-Model Planning and Interrogative Strategy
Explicit world-model planning, realized in WMA, dramatically outperforms the pure posterior baseline. The win rate increase (+24.1 points, from 50.0% to 74.1%) vastly outpaces the change in average F1 (+0.017), indicating that explicit question selection affects game success more strongly than fine-grained targeting precision. These results support the hypothesis that reasoning over explicit world structure, even with simple policies, is a major driver of agent success.
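One common way to realize explicit question selection over a declared posterior is greedy expected-entropy reduction. The paper does not specify this exact procedure, so the following is a hedged illustration with invented names (`answer_model` is an assumed interface mapping a question to weighted hypothetical posteriors):

```python
import math

def entropy(posterior):
    """Shannon entropy (bits) of a cell-wise Bernoulli posterior."""
    h = 0.0
    for p in posterior.values():
        for q in (p, 1.0 - p):
            if 0.0 < q < 1.0:
                h -= q * math.log2(q)
    return h

def select_question(posterior, questions, answer_model):
    """Pick the question minimizing expected posterior entropy.

    answer_model(question, posterior) -> list of (probability, updated_posterior)
    pairs, i.e., hypothetical transitions for each possible answer.
    """
    def expected_entropy(q):
        return sum(p * entropy(post) for p, post in answer_model(q, posterior))
    return min(questions, key=expected_entropy)
```

Even a greedy scorer like this makes question choice a function of explicit world structure rather than token-level heuristics, which is the property the WMA result credits.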
Symbolic Reflection and Self-Revision
Introducing symbolic reflection (MRA) does not uniformly confer improvement; average win rate and F1 remain flat or marginally regress. However, case analysis reveals that symbolic revisions are critical in specific board/game configurations (e.g., B17-seed0), while suboptimal policy presets may induce harmful revisions elsewhere. The key contribution is methodological: because reflective mechanisms are isolated as diagnosable runtime structures, failure modes can be identified and revision strategies calibrated directly, rather than through opaque prompt tuning.
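To make concrete what a purely symbolic, LLM-free revision mechanism can look like, here is a sketch of a guard that patches the policy when observed performance diverges from the posterior's prediction. The trigger condition, thresholds, and preset names are all illustrative assumptions, not the paper's presets:

```python
def should_revise(hits, shots, expected_hit_rate, min_shots=5, tolerance=0.2):
    """Symbolic revision guard: fire when the observed hit rate falls well
    below the posterior's prediction. Thresholds are illustrative; no LLM
    is involved at any point."""
    if shots < min_shots:          # insufficient evidence to judge the policy
        return False
    return (hits / shots) < expected_hit_rate - tolerance

def revise_policy(policy):
    """Deterministic in-episode patch, e.g., switch from an exploit preset
    to an explore preset (preset names are hypothetical)."""
    return {**policy, "mode": "explore"}
```

The paper's finding that miscalibrated presets can induce harmful revisions corresponds here to choosing `tolerance` or `min_shots` badly; because both are declared parameters, that failure mode is directly inspectable.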
Sparse LLM Revision
MRA-LLM introduces LLM-based revision as a runtime-instrumented intervention, invoked on 4.3% of turns when permissive gating is enabled. The effect is non-monotonic and tradeoff-prone: average F1 increases slightly (+0.005, to 0.557), but win rate drops (to 53.7%). This indicates that while LLM revision can enhance local action quality, it can impair completion efficiency by consuming scarce question/shot resources. The explicit externalization of the reflective protocol makes these effects measurable and reproducible, in contrast to prompt-embedded alternatives.
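The sparse-gating pattern itself can be sketched as a deterministic gate wrapped around an arbitrary LLM call. The confidence signal and threshold below are illustrative assumptions; the 4.3% invocation rate reported above emerges from the paper's gate and environment, not from this sketch:

```python
def gated_llm_revision(state, signal, call_llm, threshold=0.15):
    """Invoke the LLM only on turns where the deterministic gate opens
    (here: confidence below an illustrative threshold). `call_llm` is any
    callable returning a revised plan; `signal` is a deterministic
    confidence computation from the Signals layer."""
    if signal(state) >= threshold:
        return None  # gate closed: no LLM call this turn
    return call_llm(state)
```

Because the gate is external code rather than prompt logic, the invocation rate and its downstream effect on win rate are directly countable, which is what makes the reported tradeoff measurable.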
Theoretical and Methodological Implications
The primary scientific impact is the demonstration that symbolic, declaratively implemented layers—explicit planning, runtime-calibrated reflection—can account for the majority of agent efficacy, with the LLM confined to a residual, quantitatively bounded role. This reconfigures the locus of intelligence in LLM agents: the protocol demonstrates that with careful scaffolding, most world modeling, planning, and even reflection can be externalized and directly interrogated, reserving LLM invocation for well-justified, context-specific cases.
Notably, this decomposition allows agent designers to precisely attribute and optimize where LLM interventions are empirically beneficial, moving away from undiagnosable prompt-engineering paradigms. The proposed method offers a template for robust agent evaluation that can, in principle, generalize beyond the Battleship domain to any structured, partially observable environment.
Potential Future Directions
- Cross-domain Validation: The protocol’s generality allows extension to other domains with hybrid symbolic-LLM agent architectures.
- Adaptive Calibration: Future work may optimize revision policy parameters dynamically, informed by meta-level learning rather than hand-tuned presets.
- Broader LLM Invocations: Investigating more granular or strategic LLM invocation schedules could further clarify the boundaries and utility of LLM-driven revision.
- Integration with Language-grounded Belief Models: Combining this decomposition with language-informed belief updates (e.g., LIPS-style mechanisms) offers a promising avenue for agents operating in more open-ended linguistic environments.
Conclusion
Explicit decomposition of agent mechanisms into belief tracking, world-model-based planning, symbolic reflection, and sparsely gated LLM revision yields a rigorous framework for measuring the marginal contribution of LLMs in self-revising agents. The majority of agent competence in the tested regime is attributable to explicit planning; symbolic reflection operates as a tangible runtime mechanism, though its aggregate benefit is sensitive to calibration. Sparse LLM revision exerts only bounded and non-monotonic influence. These results recommend a design posture: maximize the declared, symbolic substrate and reserve LLMs for empirically justified intervention, instrumented through declarative reflective protocols. This approach shifts the core methodology of LLM-agent research from prompt entanglement to auditable, structure-centric development.