
LLM Agents

Updated 30 November 2025
  • LLM Agents are autonomous, modular systems that integrate large transformer models with specialized roles for reasoning, planning, and tool use.
  • They employ segmented workflows that reduce token usage, enhance precision, and enable robust iterative error recovery with human-in-the-loop support.
  • LLM Agents demonstrate practical applications in code refactoring, scientific discovery, and strategic planning while advancing privacy and explainability through structured protocols.

LLM agents are autonomous, modular components built around large transformer-based LLMs, configured to operate in stateful, multi-step environments. They encapsulate specialized reasoning, planning, tool-use, memory management, and actuation roles, often in orchestrated architectures where agents communicate, delegate, and collaborate. Unlike monolithic LLM pipelines, agentic architectures segment workflows into specialized personas or modules, often interfacing with external tools, deterministic analyzers, or other agents, yielding adaptive and auditable decision-making pipelines suited to complex tasks such as software engineering, scientific discovery, strategic planning, and technical operations (Tawosi et al., 3 Oct 2025, Mi et al., 6 Apr 2025, Pehlke et al., 10 Nov 2025).

1. Agentic Architectures and Operational Paradigms

LLM agents differ fundamentally from single-prompt LLM pipelines. Rather than feeding all context and instruction to a single LLM instance and receiving an opaque output, modern agentic frameworks decompose tasks into narrowly scoped components, each with a dedicated system prompt, context window, and error-handling logic. For example, the LADU (LLM Agents for Dependency Upgrades) system employs three agents:

  • Summary Agent: Generates structured codebase summaries using a Meta-RAG approach, aligning summaries with AST nodes to support efficient retrieval.
  • Control Agent: Orchestrates upgrade plans using contextual summaries, migration guides, and failure logs, and emits structured edit instructions.
  • Code Agent: Applies precise code edits, signals back for summary updates, and interfaces with build/test loops.

This modular approach reduces prompt context length (cutting token consumption up to 90%), improves the precision of code edits (71.4% vs. 17.2% for monolithic baselines), and enables robust iterative error recovery and human-in-the-loop handover (Tawosi et al., 3 Oct 2025). All communication between agents is via text prompts and their outputs; no non-LLM RPC or message buses are required.
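As a concrete illustration, the Summary → Control → Code cycle described above can be sketched as a driver loop. This is a hypothetical sketch, not the LADU implementation: the `llm` stub, the class name `LaduLoop`, and the `build_ok` callback are all illustrative stand-ins for real model calls and a real build/test harness.

```python
from dataclasses import dataclass, field

def llm(system_prompt: str, user_prompt: str) -> str:
    """Hypothetical stand-in for a model API call."""
    return f"[{system_prompt}] response to: {user_prompt[:40]}"

@dataclass
class LaduLoop:
    """Sketch of a three-agent upgrade cycle with a build/test feedback loop."""
    max_iterations: int = 3
    log: list = field(default_factory=list)

    def summarize(self, codebase: str) -> str:
        return llm("Summary Agent: emit AST-aligned summaries", codebase)

    def plan(self, summary: str, failure_log: str) -> str:
        return llm("Control Agent: emit structured edit instructions",
                   summary + "\n" + failure_log)

    def edit(self, instructions: str) -> str:
        return llm("Code Agent: apply precise edits", instructions)

    def run(self, codebase: str, build_ok) -> bool:
        failure_log = ""
        for i in range(self.max_iterations):
            summary = self.summarize(codebase)
            instructions = self.plan(summary, failure_log)
            patch = self.edit(instructions)
            self.log.append(patch)
            if build_ok(patch):            # build/test loop closes the cycle
                return True
            failure_log = f"iteration {i} failed"
        return False                        # hand over to a human reviewer
```

Note that all inter-agent communication here is plain strings, mirroring the text-prompt-only coordination described above.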

2. Agent Abstractions and Formal Models

Systematic design has shifted toward computer-systems-inspired frameworks. An agent is typically formalized as a tuple $F = (P, C, M, T, A)$, where:

  • $P$ (Perception): encodes raw observations into a feature space.
  • $C$ (Cognition): reasoning, planning, and control.
  • $M$ (Memory): short- and long-term stores.
  • $T$ (Tools): external executors (APIs, code runners, simulators).
  • $A$ (Action): commands issued to the environment.

This design maps directly onto von Neumann or microservices principles, with abstraction and modularity enabling concurrent perception, reasoning, and action via independent process flows (Mi et al., 6 Apr 2025). Agent policies may be defined as $a_t = A\bigl(C\bigl(P(o_{1..t}),\, M_{\mathrm{read}}(H, x_t),\, T(q_t)\bigr)\bigr)$, learned via in-context learning, fine-tuning, or reinforcement learning (PPO or RLHF variants).
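A minimal rendering of the $F = (P, C, M, T, A)$ decomposition treats each component as an injected callable. All names here are illustrative assumptions, not an API from the cited framework:

```python
class Agent:
    """Sketch of F = (P, C, M, T, A): each component is injected,
    mirroring the modular, swappable design described above."""

    def __init__(self, perceive, cognize, memory, tools, act):
        self.perceive = perceive   # P: raw observation -> features
        self.cognize = cognize     # C: (features, recalled, tool output) -> decision
        self.memory = memory       # M: dict-like short/long-term store
        self.tools = tools         # T: query -> external execution result
        self.act = act             # A: decision -> environment command

    def step(self, observation, query):
        features = self.perceive(observation)
        recalled = self.memory.get(features, None)
        tool_out = self.tools(query)
        decision = self.cognize(features, recalled, tool_out)
        self.memory[features] = decision   # write-back for the next step
        return self.act(decision)
```

Because each slot is an independent callable, perception, reasoning, and action can be swapped or run in separate processes, which is the point of the systems analogy.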

In multi-agent contexts, each agent $A_i = (S_i, O_i, A_i, M_i, T_i, \pi_i)$ maintains private state, observation, action, memory, tools, and a policy, with orchestration achieved via explicit protocols for message passing, task decomposition, and consensus (Yang et al., 21 Nov 2024).

3. Tool Use, Knowledge Discovery, and Advanced Reasoning

LLM agents frequently interface with tool APIs through classical function-calling, black-box experimentation, or automated web browsing. In knowledge discovery, agents use the ReAct paradigm, interleaving thought, action, and observation, and interact with scientific simulators or arbitrary function wrappers.
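The thought/action/observation interleaving can be sketched as a small driver loop. This is a generic sketch of the ReAct pattern, not any cited system's code: `llm_step` is a hypothetical stand-in for the model call, and the `Action: tool[arg]` / `Finish:` line conventions are assumptions made for illustration.

```python
import re

def react_loop(llm_step, tools, max_steps=5):
    """Drive a ReAct-style agent: the model emits thoughts, tool
    actions, or a final answer; tool calls feed observations back."""
    transcript = []
    for _ in range(max_steps):
        step = llm_step(transcript)          # model sees the transcript so far
        transcript.append(step)
        if step.startswith("Finish:"):
            return step[len("Finish:"):].strip(), transcript
        match = re.match(r"Action:\s*(\w+)\[(.*)\]", step)
        if match:
            name, arg = match.groups()
            observation = tools[name](arg)   # execute the named tool
            transcript.append(f"Observation: {observation}")
    return None, transcript                  # budget exhausted without an answer
```

The same loop structure accommodates a scientific simulator in place of `tools`, which is how the black-box experimentation workflows below fit the paradigm.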

For instance, in atomic layer processing, agents operate without task-specific optimization, iteratively generate and refine hypotheses, and probe a reactor simulation with limited sensory feedback. This workflow leverages trial-and-error, persistence, and independent summarization to produce generalizable statements about physical phenomena, demonstrating that LLM agents can autonomously discover nontrivial rules in black-box domains given sufficient experimental budget and interface structure (Werbrouck et al., 30 Sep 2025).

4. Strategic and Collective Intelligence: Multi-Agent Systems

When deployed as collectives, LLM agents enable dynamic task decomposition, specialization, and robust consensus formation. In LaMAS (LLM-based Multi-Agent Systems), protocols govern structured message exchange, consensus negotiation (voting and bargaining steps), credit allocation (e.g., Shapley-value partitioning), and experience management (sharing via logs or differential privacy).
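For small teams, Shapley-value credit allocation can be computed exactly by averaging marginal contributions over all agent orderings. This is a generic sketch of the standard formula, not a protocol from the cited work; the `value` function and agent names are illustrative assumptions.

```python
from itertools import permutations

def shapley_values(agents, value):
    """Exact Shapley credit: average each agent's marginal contribution
    over every ordering. Tractable only for small teams (n! orderings).
    `value` maps a frozenset of agents to the team's realized utility."""
    credit = {a: 0.0 for a in agents}
    orderings = list(permutations(agents))
    for order in orderings:
        coalition = frozenset()
        for agent in order:
            marginal = value(coalition | {agent}) - value(coalition)
            credit[agent] += marginal
            coalition = coalition | {agent}
    return {a: c / len(orderings) for a, c in credit.items()}
```

For a task that succeeds only when both a planner and a coder participate, each agent's Shapley share is 0.5, which matches the intuition that neither can claim the reward alone.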

Multi-agent architectures yield several advantages:

  • Flexible specialization: Capabilities can be dynamically matched to subtask requirements, with orchestrators and routers leveraging capability/requirement similarity matrices (Yang et al., 21 Nov 2024).
  • Data privacy: Each agent may encapsulate proprietary memory or context, sharing only necessary state under privacy-preserving schemes (TEEs, HE, differential privacy budgets).
  • Resilient consensus: Byzantine-robust protocols (e.g., DecentLLMs) achieve leaderless agreement by parallel generation and robust geometric median aggregation of answer vectors, tolerating up to nearly half adversarial or faulty agents (Jo et al., 20 Jul 2025).
  • Monetization: Credit allocation protocols can be instrumented to support real-world agent-app marketplaces and incentivization schemes.
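The robust aggregation step behind Byzantine-tolerant consensus can be illustrated with the standard Weiszfeld iteration for the geometric median. This is a generic numerical sketch under the assumption that agent answers are embedded as real-valued vectors, not the DecentLLMs implementation.

```python
import math

def geometric_median(points, iterations=100, eps=1e-9):
    """Weiszfeld iteration: iteratively reweight points by inverse
    distance to the current estimate. Outlier (Byzantine) vectors pull
    the result far less than they would pull a coordinate-wise mean."""
    dim = len(points[0])
    estimate = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    for _ in range(iterations):
        weights, num = [], [0.0] * dim
        for p in points:
            dist = math.dist(p, estimate)
            w = 1.0 / max(dist, eps)       # guard against division by zero
            weights.append(w)
            for d in range(dim):
                num[d] += w * p[d]
        total = sum(weights)
        estimate = [num[d] / total for d in range(dim)]
    return estimate
```

With four honest answer vectors near (1, 1) and one adversarial vector at (100, −100), the mean lands far from the honest cluster while the geometric median stays inside it, which is what makes this aggregation leaderless-consensus-friendly.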

This marks a transition from monolithic, single-agent deployments to modular, privacy-preserving, and market-incentivized collective intelligence.

5. Explainability, Evaluation, and Security

Agentic pipelines externalize reasoning via structured artifacts (matrices, game trees, step plans), while deterministic analyzers (for equilibria, role classification, and backward induction) provide traceability and swappability throughout the process. In modular explainable pipelines, for example, each agentic component produces audit-ready outputs, with deterministic code rather than LLM text generation handling core numerical and logical reasoning; this yields strong rubric-based assessment results, with mean human alignment nearing 63% on core factors (Pehlke et al., 10 Nov 2025).
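Backward induction itself is a small deterministic routine, which is why delegating it to code rather than to LLM text generation is attractive. The tree encoding below (leaves as payoff tuples, internal nodes as `(player_index, children)`) is an assumption made for illustration.

```python
def backward_induction(node):
    """Solve a two-player game tree by backward induction: at each
    internal node the acting player picks the child maximizing their
    own payoff in the subgame-perfect continuation."""
    if isinstance(node, tuple) and all(isinstance(x, (int, float)) for x in node):
        return node, []                      # leaf: payoffs (player 0, player 1)
    player, children = node
    best_payoff, best_path = None, None
    for i, child in enumerate(children):
        payoff, path = backward_induction(child)
        if best_payoff is None or payoff[player] > best_payoff[player]:
            best_payoff, best_path = payoff, [i] + path
    return best_payoff, best_path
```

Because the routine returns both the equilibrium payoff and the move path, an agentic pipeline can log the path as an audit artifact alongside the LLM's narrative explanation.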

Systematic evaluation frameworks for LLM agents span planning, tool use, reflection, and memory via realistic, multi-task benchmarks; dynamic environments (AgentBench, LifelongAgentBench); and multi-agent simulation (Alympics, ShapeLLM in game-theoretic settings) (Yehudai et al., 20 Mar 2025, Zheng et al., 17 May 2025, Mao et al., 2023, Segura et al., 9 Oct 2025).

Security principles remain central as agentic systems introduce new privacy and manipulation threats. AgentSandbox operationalizes defense-in-depth, least privilege, complete mediation, and psychological acceptability in agent ecosystems. Ephemeral agent separation, layered data minimization, and policy-based firewalls reduce attack success rates to as low as 4.3% with minimal utility loss, outperforming ad hoc prompt filters (Zhang et al., 29 May 2025).

6. Applications and Limitations

LLM agents have demonstrated efficiency, robustness, and adaptability in domains including:

  • Software engineering: automated dependency upgrades and precise code refactoring via modular agent pipelines (Tawosi et al., 3 Oct 2025).
  • Scientific discovery: autonomous hypothesis generation and black-box experimentation, as in atomic layer processing (Werbrouck et al., 30 Sep 2025).
  • Strategic planning: game-theoretic analysis and decision support through explainable, modular pipelines (Pehlke et al., 10 Nov 2025).

Current limitations include context-length constraints, overreliance on prompt engineering, sensitivity to nudges, and the need for memory-augmented or meta-learning architectures to support lifelong adaptation. Experience replay and group self-consistency mechanisms can partially mitigate statelessness but introduce compute and inference costs (Zheng et al., 17 May 2025).

7. Outlook and Research Frontiers

Open challenges and directions for LLM agent research include:

  • Lifelong adaptation through memory-augmented and meta-learning architectures (Zheng et al., 17 May 2025).
  • Privacy-preserving collaboration and principled, systems-level security guarantees for agent ecosystems (Zhang et al., 29 May 2025).
  • Standardized, dynamic benchmarks covering planning, tool use, reflection, and memory (Yehudai et al., 20 Mar 2025).

The convergence of modularity, robust orchestration, explainability, and collective intelligence positions LLM agents as foundational to the next generation of adaptive, trustworthy, and auditable computational systems.
