Agentic Proposing in AI Systems

Updated 10 February 2026

Agentic proposing is an iterative process in which agents use internal world models and dynamic goals to generate and refine proposals.
Methodologies employ sequential decision processes and agentic dialogue frameworks with specific design patterns for proposal validation, scoring, and execution.
Applications extend to autonomous problem synthesis, collaborative decision-making, and software engineering, demonstrating significant empirical performance gains.

Agentic proposing is a core construct in contemporary research on agentic AI systems, denoting the mechanism by which agents dynamically generate, evaluate, and refine candidate actions or artifacts in response to evolving goals, context, and feedback. The agentic proposing paradigm underpins advances in autonomous problem synthesis, decision discourse, compositional reasoning, and software engineering with agentic AI, and is increasingly formalized both within system-theoretic agent architectures and data-centric frameworks for the training of LLMs (Dao et al., 27 Jan 2026, Jiao et al., 3 Feb 2026, Dolant et al., 16 Feb 2025, Hoda, 22 Oct 2025).

1. Formal Definitions and General Architecture

Agentic proposing is the iterative process by which one or more agents, drawing on their own internal world models and objectives, construct new proposals (course-of-action candidates, problem instances, decisions, or code artifacts) and submit them for self-execution, peer evaluation, or both. In systems-theoretic terms, this operation is captured by a proposal-generation function $D$, defined as:

$\pi_t = D(W_t, G_t)$

where $W_t$ is the agent’s world model at time $t$, $G_t$ is its active goal set, and $\pi_t$ is the resulting set of proposals. This process is embedded within architectures that decompose the agent into interacting subsystems (for example, Perception & Grounding, Reasoning & World Model, Action Execution, Inter-Agent Communication, and Learning & Adaptation), each contributing distinct phases to the propose-and-refine pipeline (Dao et al., 27 Jan 2026).
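As a minimal sketch of the mapping $\pi_t = D(W_t, G_t)$, the Python below treats the world model as a dictionary of numeric facts and emits one proposal per unmet goal. The names `WorldModel`, `Goal`, `Proposal`, and `propose` are hypothetical and chosen only for illustration; the cited architecture does not prescribe this representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WorldModel:
    """W_t: the agent's current beliefs (assumed here to be numeric facts)."""
    facts: Dict[str, float] = field(default_factory=dict)


@dataclass(frozen=True)
class Goal:
    """One element of G_t: a target value for a single fact."""
    name: str
    key: str
    target: float


@dataclass(frozen=True)
class Proposal:
    """One element of pi_t: a suggested change to a single fact."""
    key: str
    delta: float
    goal: str


def propose(world: WorldModel, goals: List[Goal]) -> List[Proposal]:
    """Toy stand-in for D(W_t, G_t): one proposal per goal not yet met."""
    proposals = []
    for goal in goals:
        # Facts missing from the world model are assumed to be 0.0.
        current = world.facts.get(goal.key, 0.0)
        if current != goal.target:
            proposals.append(Proposal(goal.key, goal.target - current, goal.name))
    return proposals


world = WorldModel({"temperature": 18.0})
goals = [Goal("comfort", "temperature", 21.0), Goal("quiet", "noise", 0.0)]
proposals = propose(world, goals)
```

In a full agent, the refine step would feed execution feedback back into `world` and call `propose` again at time $t+1$.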

In multi-agent or dialogue-based settings, agentic proposing is orchestrated across agent assemblies, with each agent endowed with an internal state $S_i^t = (C_i^t, M_i^t, P_i, O_i)$ capturing, respectively, conversational context, memory, a persona vector, and weighted objectives (Dolant et al., 16 Feb 2025). Agents propose and score actions, adapt dialogue structure, and trigger expert summoning or reconfiguration based on mutual information across their knowledge subspaces.
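The state tuple $S_i^t = (C_i^t, M_i^t, P_i, O_i)$ can be pictured as a small record type. The sketch below is an assumption-laden illustration: the field types, the `observe` and `score` methods, and the weighted-sum scoring rule are all hypothetical and not taken from the cited framework.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentState:
    """Illustrative container for S_i^t = (C_i^t, M_i^t, P_i, O_i)."""
    context: List[str] = field(default_factory=list)        # C_i^t, changes over time
    memory: Dict[str, str] = field(default_factory=dict)    # M_i^t, changes over time
    persona: Dict[str, float] = field(default_factory=dict)     # P_i, fixed per agent
    objectives: Dict[str, float] = field(default_factory=dict)  # O_i, objective weights

    def observe(self, message: str) -> None:
        """Append an incoming dialogue message to the conversational context."""
        self.context.append(message)

    def score(self, features: Dict[str, float]) -> float:
        """Weighted sum of a proposal's features (assumed scoring rule)."""
        return sum(w * features.get(name, 0.0) for name, w in self.objectives.items())


agent = AgentState(
    persona={"risk_aversion": 0.9},
    objectives={"safety": 0.75, "cost": 0.25},
)
agent.observe("Proposal A: reinforce the levee before the storm.")
utility = agent.score({"safety": 1.0, "cost": 0.0})
```

Only the context and memory carry the time index $t$ in the tuple, which is why the sketch mutates those two fields and leaves persona and objectives untouched.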

2. Methodologies and Workflow Realizations

Agentic proposing is operationalized via sequential decision processes, dialogue orchestration, and pattern-driven agent architectures:

Sequential Decision Processes: Problem synthesis is cast as a goal-driven partially observable Markov decision process (POMDP), where at each step the proposing policy $\pi_\theta$ composes modular skills and selects from cognitive, tool, or submission actions. Observable states include active skill libraries, history buffers, and stage indicators (Draft/Check/Refine) (Jiao et al., 3 Feb 2026).
Agentic Dialogue and Adaptive Assembly: Message-driven frameworks employ micro-scale orchestration to select speakers, extract proposals, and govern summoning or retirement of agentic experts in response to redundancy or unique knowledge metrics. Proposals are scored by agent-specific utility functions and consensus criteria, and the group dynamically halts proposing on convergence checks (Dolant et al., 16 Feb 2025).
Design Patterns: Twelve agentic design patterns undergird robust agentic proposing, including Integrator (input validation), Retriever (context memory), Planner (plan decomposition), Selector (proposal prioritization), Deliberator (action selection), Executor (proposal realization), Reflector (feedback-driven adaptation), and Skill Build (macro-move extraction). These archetypes structure proposal generation, execution, feedback, and learning flows, and modularize the propose-critique-act-learn cycle (Dao et al., 27 Jan 2026).

Subsystem	Primary Patterns Involved	Proposing Role
Perception & Grounding	Integrator	Validates/filters inputs
Reasoning & World Model	Retriever, Planner, Selector, Deliberator	Plans, prioritizes, selects proposals
Action Execution	Executor, Tool Use	Runs proposals, collects feedback
Learning & Adaptation	Reflector, Skill Build	Refines future proposals
Inter-Agent Communication	Coordinator	Negotiates multi-agent plans
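To make the division of labor in the table concrete, the following sketch chains five of the patterns as plain functions. It is a toy under stated assumptions: the function bodies, the length-based priority, and the rule that any proposal containing "fail" fails at execution are all invented for illustration and do not reproduce the patterns as specified by Dao et al.

```python
from typing import Dict, List, Tuple


def integrator(raw_inputs: List[str]) -> List[str]:
    """Perception & Grounding: drop empty inputs and trim whitespace."""
    return [item.strip() for item in raw_inputs if item and item.strip()]


def planner(inputs: List[str]) -> List[str]:
    """Reasoning & World Model: turn each input into a candidate proposal."""
    return [f"handle:{item}" for item in inputs]


def selector(candidates: List[str]) -> List[str]:
    """Reasoning & World Model: prioritize proposals (toy rule: longest first)."""
    return sorted(candidates, key=len, reverse=True)


def executor(proposal: str) -> Tuple[str, bool]:
    """Action Execution: run a proposal and report success (toy failure rule)."""
    return proposal, "fail" not in proposal


def reflector(results: List[Tuple[str, bool]]) -> List[str]:
    """Learning & Adaptation: collect failed proposals for later refinement."""
    return [proposal for proposal, ok in results if not ok]


def run_cycle(raw_inputs: List[str]) -> Dict[str, List[str]]:
    """One propose-act-learn pass through the pipeline."""
    ranked = selector(planner(integrator(raw_inputs)))
    results = [executor(p) for p in ranked]
    return {
        "executed": [p for p, ok in results if ok],
        "to_refine": reflector(results),
    }


outcome = run_cycle(["  fix tests ", "", "fail deploy"])
```

Keeping each pattern behind its own function boundary is what lets a single stage be swapped out without touching the rest of the cycle.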

3. Agentic Proposing in Synthetic Reasoning and Data Generation

In LLM-centric problem synthesis, agentic proposing becomes a reinforcement learning–driven process where an agent, equipped with a modular skill library, iteratively drafts, reflects, verifies, prunes, and submits problems:

Workflow: The agent cycles through phases: skill selection, draft, internal reflection (chain-of-thought), tool invocation (e.g., symbolic computation, code execution), dynamic skill pruning, refinement, and external verification. Each step is aligned with cognitive and interactive actions, all situated in the POMDP formalism (Jiao et al., 3 Feb 2026).
Learning Objective: Multi-Granularity Policy Optimization (MGPO) is introduced to densify credit assignment across both single-stage and trajectory-level proposal quality, with rewards shaped by logical soundness and verifiability. The agent’s proposing policy is trained to maximize long-term return subject to KL-regularization against a reference, producing high-quality synthetic training data with state-of-the-art downstream solver performance.
Empirical Results: Proposers trained via this framework substantially improve cross-domain LLM solver performance. For instance, a 30B-parameter solver trained on only 11,000 agent-synthesized trajectories achieves 91.6% accuracy on AIME25; dynamic tool use and reflection yield an aggregate +6.8 accuracy improvement, and problem validity climbs from 68.7% to 82.3% (Jiao et al., 3 Feb 2026).
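The Draft/Check/Refine stages described above can be sketched as a small state machine. This is a deliberately simplified illustration: the learned policy $\pi_\theta$ is replaced by fixed transition rules, the "tool" is a direct recomputation of a product, and the names `Stage`, `Problem`, `verify`, and `run_episode` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Stage(Enum):
    DRAFT = "draft"
    CHECK = "check"
    REFINE = "refine"
    SUBMIT = "submit"


@dataclass(frozen=True)
class Problem:
    """Toy problem: 'what is a * b?' together with the proposer's claimed answer."""
    a: int
    b: int
    claimed_answer: int


def verify(problem: Problem) -> bool:
    """Tool action: recompute the answer and compare it with the claim."""
    return problem.a * problem.b == problem.claimed_answer


def run_episode(draft: Problem, max_steps: int = 6) -> Tuple[Problem, List[Stage]]:
    """Cycle through stages until the problem is submitted or steps run out."""
    problem, stage = draft, Stage.DRAFT
    history: List[Stage] = []
    for _ in range(max_steps):
        history.append(stage)
        if stage is Stage.DRAFT:
            stage = Stage.CHECK
        elif stage is Stage.CHECK:
            stage = Stage.SUBMIT if verify(problem) else Stage.REFINE
        elif stage is Stage.REFINE:
            # Repair the draft using the tool's result, then re-check.
            problem = Problem(problem.a, problem.b, problem.a * problem.b)
            stage = Stage.CHECK
        else:  # Stage.SUBMIT ends the episode
            break
    return problem, history


final_problem, stages = run_episode(Problem(a=6, b=7, claimed_answer=41))
```

In the cited framework the transition choices would come from a trained policy and the reward would reflect soundness and verifiability; here the stage history merely shows the order in which a flawed draft is caught and repaired.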

4. Collaborative Deliberation and Decision Discourse

Agentic proposing plays a central role in group decision processes, particularly where agents model distinct personas and objectives:

Persona-Driven Proposing: Each agent possesses a persona vector encoding risk attitude, equity emphasis, and domain expertise, and an objective function mapping proposals to multi-dimensional utilities. Dialogue is governed by speaker selection, message extraction, and adaptive self-governance—agents summon new expertise to address knowledge gaps or redundancy.
Evaluation and Consensus: Agents exchange and critique proposals, scoring them by the multi-objective utility $U_i(a)$. Group-level consensus incorporates both average utility and a disagreement penalty:

$U_\mathrm{group}(a) = \frac{1}{N} \sum_{i=1}^N U_i(a) - \lambda \operatorname{Var}[U_1(a),...,U_N(a)]$

Where necessary, assemblies reconfigure until convergence or coverage criteria are met (e.g., unique knowledge thresholds).

Deliberation Metrics: Group deliberative effectiveness can be quantified by proposal synergy (partial information decomposition), coherence metrics, and convergence statistics, all emergent from the proposing interplay (Dolant et al., 16 Feb 2025).
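The group-level consensus score $U_\mathrm{group}(a)$ defined above is straightforward to compute once each agent has scored a proposal. The sketch below assumes population variance for $\operatorname{Var}[\cdot]$, which the formula does not specify; the function name `group_utility` and the example utilities are likewise illustrative.

```python
from statistics import fmean, pvariance
from typing import Dict, Sequence


def group_utility(utilities: Sequence[float], lam: float) -> float:
    """Mean agent utility minus lam times the variance across agents.

    Population variance is an assumption; the source formula leaves the
    estimator unspecified.
    """
    if not utilities:
        raise ValueError("at least one agent utility is required")
    return fmean(utilities) - lam * pvariance(utilities)


def best_proposal(scores: Dict[str, Sequence[float]], lam: float) -> str:
    """Return the proposal name with the highest group utility."""
    return max(scores, key=lambda name: group_utility(scores[name], lam))


# Hypothetical utilities from three agents for two proposals.
scores = {
    "contested": [0.9, 0.9, 0.1],  # higher mean, strong disagreement
    "agreed": [0.6, 0.6, 0.6],     # lower mean, no disagreement
}
without_penalty = best_proposal(scores, lam=0.0)
with_penalty = best_proposal(scores, lam=1.0)
```

With $\lambda = 0$ the score reduces to the plain average and the contested proposal wins; raising $\lambda$ shifts the choice toward the proposal the agents agree on.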

5. Applications in Software Engineering and Systems Design

Agentic proposing informs end-to-end workflows in agentic software engineering (SE):

Agentic SE Vision: Agentic SE extends proposing from code generation and debugging to requirements engineering, ethical alignment, architectural design, deployment, and operations. Multi-actor cycles of proposing and reviewing are embedded across all life cycle phases, with shifting levels of human and agent autonomy (Hoda, 22 Oct 2025).
Socio-Technical Challenges: Practical adoption must address the social embedding of proposing agents—including trust, transparency, accountability, and organizational alignment—as well as vocabulary precision and value alignment (e.g., by adhering to the Comprehensive, Responsible, Adaptive, Foundational, and Translational (CRAFT) value set) (Hoda, 22 Oct 2025).
Pattern-Patching Legacy Pipelines: Traditional “Thought-Act” cycles in agent frameworks such as ReAct are augmented with modular proposing patterns, leading to improved robustness, finer-grained reasoning, and reliable proposal execution (Dao et al., 27 Jan 2026).

6. Advantages, Limitations, and Future Directions

Agentic proposing offers several documented advantages:

Data and Reasoning Efficiency: High-quality synthetic data generated via agentic proposing matches or surpasses massive human-curated datasets in reasoning-intensive domains (Jiao et al., 3 Feb 2026).
Compositional and Modular Flexibility: Proposing frameworks exploit modular skills and design patterns to enable compositional generalization, rapid troubleshooting, and domain adaptability (Jiao et al., 3 Feb 2026, Dao et al., 27 Jan 2026).
Collaborative Synergy: Multi-agent proposing increases the emergence of novel strategies and enhances group-level decision robustness by optimizing synergy and consensus metrics (Dolant et al., 16 Feb 2025).

Documented limitations include the cost and complexity of training proposer agents, dependence on robust verification infrastructure, and the requirement for expert-curated skill and pattern libraries (Jiao et al., 3 Feb 2026). Open lines of research emphasize self-evolving skill libraries, co-proposing agent ensembles (Teacher/Solver/Proposer configurations), and interdisciplinary benchmarks for ethics and sustainability, alongside education and the translation of findings into actionable guidelines (Hoda, 22 Oct 2025).

7. Representative Case Studies

Problem Synthesis: Agentic-proposer agents generate advanced mathematical problem instances—in one example, synthesizing an ODE-sequence-algebra intersection problem, validating steps via symbolic tools, and correcting draft inconsistencies through internal chain-of-thought reflection (Jiao et al., 3 Feb 2026).
Flood Disaster Planning: An assembly of diverse persona-driven agents proposes, critiques, and converges on actionable disaster response strategies, dynamically summoning domain experts and quantifying both coherence and synergy along the way (Dolant et al., 16 Feb 2025).
Patching ReAct: Introducing propositional validation, contextual retrieval, and feedback-driven refinement mechanisms remedies traditional monolithic action cycles, reducing hallucination and brittleness in LLM agents (Dao et al., 27 Jan 2026).

Agentic proposing thus constitutes a theoretically grounded and empirically validated paradigm, central to robust, adaptable, and collaborative agentic AI systems spanning data generation, decision discourse, and complex software engineering. Its continued refinement is projected to drive advances in reliability, compositionality, and human-AI coevolution across domains.