Suggester–Editor Agents
- Suggester–Editor Agents are modular frameworks that decouple the creative suggestion phase from the rigorous editing process using specialized LLM-based agents.
- They employ strictly defined JSON protocols and multi-round iterations to ensure accurate communication, systematic evaluation, and quality control in tasks such as annotation, writing, and code repair.
- Empirical studies show that incorporating an Editor stage significantly boosts output metrics, as seen with gains in F1 scores and repair resolution rates across various domains.
Suggester–Editor Agents are modular multi-agent frameworks in which the proposal and vetting of candidate outputs are structurally separated into two (or more) specialized LLM-based agents. The “suggester” (also termed Annotator, Improver, Suggester Agent, or Viewer) is responsible for proposing initial outputs given a task input, while the “editor” (Review Agent, Reviewer, Fixer, Editor Agent) evaluates, critiques, and often refines these suggestions. This division, formalized across domains such as linguistic annotation, writing assistance, business insight generation, and automated code repair, is shown to enhance both throughput and output quality by simulating a professional peer-review or collaborative workflow (Li, 5 Feb 2026, Chu et al., 28 Dec 2025, Hou et al., 2 Dec 2025, Bhandari et al., 17 Jan 2026, Zhang et al., 27 Feb 2026, Zhang et al., 28 Apr 2026).
1. Architectural Principles and Agent Roles
In all domains, Suggester–Editor Agent systems instantiate at least two explicit agent roles with well-defined communication interfaces, typically realized through strict JSON schemas or protocol specification. The Suggester receives an input—raw task data, code snippets, document sections, or review clusters—and outputs a proposed answer or candidate edit. The Editor, operating in parallel or as a subsequent step, receives the original input plus the suggester’s proposal, and issues a critique, revision, and/or quality-graded output. This architecture is exemplified by:
- LinguistAgent: Annotator (Suggester) emits annotated text and reasoning trace; Reviewer (Editor) flags errors and issues a revised annotation (Li, 5 Feb 2026).
- PaperDebugger: Suggester proposes patch objects and explanations over LaTeX; Editor validates/applies patches, updating version control and document hashes (Hou et al., 2 Dec 2025).
- SGAgent: Suggester aggregates bug context and synthesizes repair plans; Fixer (Editor) produces executable patches justified with textual rationale (Zhang et al., 27 Feb 2026).
- SWE-Edit: The main agent (the Suggester, termed the Viewer in SWE-Edit) formulates natural-language modification plans; the Editor subagent renders them into executable code diffs (Zhang et al., 28 Apr 2026).
This delineation preserves orthogonality between creative ideation and conservative curation, with communication enforced by machine-parseable message formats.
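The two-role handoff described above can be sketched as a minimal loop. The function names and stub logic below are illustrative assumptions, not taken from any of the cited systems; in practice each function would wrap an LLM call, but the JSON contract between the roles is the point being shown.

```python
import json

def suggest(task_input: str) -> dict:
    """Hypothetical Suggester: propose an output plus a reasoning trace."""
    # A real system would call an LLM here; a stub illustrates the contract.
    return {"proposal": f"annotated({task_input})", "rationale": "stub reasoning"}

def edit(task_input: str, suggestion: dict) -> dict:
    """Hypothetical Editor: critique and revise the Suggester's proposal."""
    return {
        "critique": "no errors found",
        "revised": suggestion["proposal"],
        "status": "accepted",
    }

def run_pipeline(task_input: str) -> dict:
    suggestion = suggest(task_input)
    verdict = edit(task_input, suggestion)
    # Machine-parseable handoff: both messages round-trip through JSON.
    return {"suggestion": suggestion, "verdict": verdict}

result = run_pipeline("The economy is a sinking ship.")
print(json.dumps(result, indent=2))
```

Because each role returns a plain JSON record, either agent can be swapped out or fine-tuned independently without changing the other's interface.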
2. Formal Agent Interfaces and Communication Protocols
Agent interfaces employ strongly typed, machine-verifiable schemas to guarantee correct handoff and facilitate autonomous orchestration and evaluation. Canonical examples include:
| Agent Role | Input Fields | Output Fields |
|---|---|---|
| Suggester | Task input (text, code, selection); system/user prompt; optional context (retrieved codebook, exemplars, KB) | Proposed output (patches, annotations, suggestions); rationale or reasoning trace |
| Editor | Original input plus Suggester's output; base version | Critique; revised output; application status; diff summary |
In LinguistAgent, Annotator and Reviewer each return JSON records with separated fields for suggestion, reasoning, critique, and revised annotation. PaperDebugger employs a Model Context Protocol (MCP) with Pydantic/Protobuf schemas for patch proposals and application acknowledgments (Hou et al., 2 Dec 2025). SGAgent agents coordinate through explicit JSON-like schemas with handshakes and completion signals, with every pipeline output tagged by a task ID (Zhang et al., 27 Feb 2026).
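A stdlib-only sketch of such typed records follows (PaperDebugger itself uses Pydantic/Protobuf under MCP; the field names and the `validate_handoff` check here are assumptions modeled on the schemas described, not the actual implementations).

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class SuggesterMessage:
    task_id: str           # every pipeline output is tagged by a task ID
    proposal: str          # e.g. a patch, annotation, or suggestion
    rationale: str         # reasoning trace accompanying the proposal

@dataclass
class EditorMessage:
    task_id: str
    critique: str
    revised_output: str
    status: str            # e.g. "applied", "rejected"
    diff_summary: str = ""

def validate_handoff(s: SuggesterMessage, e: EditorMessage) -> bool:
    """Reject handoffs whose task IDs do not match."""
    return s.task_id == e.task_id

s = SuggesterMessage("T-001", "replace 'utilise' with 'use'", "plainer diction")
e = EditorMessage("T-001", "agreed", "use", "applied", "1 word changed")
assert validate_handoff(s, e)
print(json.dumps({"suggester": asdict(s), "editor": asdict(e)}))
```

Separating the generated fields (`proposal`, `revised_output`) from the decision fields (`critique`, `status`) is what makes downstream scoring and error traceability straightforward.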
Typically, the protocol proceeds in deterministic one-round or optionally multi-round steps:
- Suggester/Annotator proposes.
- Editor/Reviewer evaluates and revises.
- Final output is evaluated against ground truth or external criteria.
In reflective pipelines (e.g., iterative advice refinement (Bhandari et al., 17 Jan 2026)), multiple suggestion-critique cycles can be triggered until a quantitative threshold is met.
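Such a threshold-gated loop can be sketched as follows; the function names (`score`, `revise`) and the toy quality increments are illustrative stand-ins for the Editor's rubric scoring and the Suggester's revision call, not details from the cited papers.

```python
def refine_until_threshold(draft, score, revise, threshold=0.8, max_rounds=5):
    """Run suggestion-critique cycles until a quantitative threshold is met.

    `score` plays the Editor's rubric evaluation; `revise` plays the
    Suggester's revision step (both hypothetical names).
    """
    for round_idx in range(max_rounds):
        quality = score(draft)
        if quality >= threshold:
            return draft, round_idx
        draft = revise(draft, quality)
    return draft, max_rounds

# Toy stand-ins: each revision adds a fixed quality increment.
score = lambda d: d["quality"]
revise = lambda d, q: {"text": d["text"] + " (revised)", "quality": q + 0.25}

final, rounds = refine_until_threshold(
    {"text": "advice v0", "quality": 0.25}, score, revise
)
print(rounds, final["quality"])  # terminates once quality clears the threshold
```

The `max_rounds` cap matters in practice: without it, a proposal the Editor can never score above threshold would loop indefinitely.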
3. Methodological Instantiations Across Domains
Suggester–Editor frameworks generalize across heterogeneous tasks, each integrating auxiliary retrieval, evaluation, or ranking subsystems.
- Linguistic Annotation: LinguistAgent enables multi-paradigm operation (Zero/Few-shot Prompting, Retrieval-Augmented Generation, Fine-tuning). The Suggester–Editor loop remains invariant, with only the construction of agent prompts (ℓ_A, ℓ_R) varying by paradigm (Li, 5 Feb 2026).
- Editing and Revision: In PaperDebugger, the Suggester agent can invoke multiple tools (language polish, structural critique, literature search) and emits patch objects, while the Editor agent ensures correct application and versioning directly within Overleaf (Hou et al., 2 Dec 2025).
- Business Advice: When distilling customer reviews, the Suggester clusters input data and generates first-draft actionable advice; the Editor applies a structured rubric (SRAC) to critique and iteratively refine these outputs, providing domain-aligned improvement feedback (Bhandari et al., 17 Jan 2026).
- Job Referral Requests: The Suggester (Improver) uses prompt-driven LLM calls (optionally RAG-augmented) to optimize text, while the Editor (Evaluator) is an L1-regularized classifier trained via LoRA on outcome data. Iterative retriever and explainer components assign ratings and steer generation granularity (Chu et al., 28 Dec 2025).
- Software Engineering: In both SGAgent and SWE-Edit, code inspection, planning, and editing are fully decoupled: diagnostics, context gathering, and plan formation are suggester tasks; atomic edit execution and formatting are the responsibility of the editor agent (Zhang et al., 27 Feb 2026, Zhang et al., 28 Apr 2026).
This separation allows independent scaling, fine-tuning, and evaluation of each agentic module as task or system requirements evolve.
4. Quantitative Evaluation and Empirical Outcomes
Metrics are application-specific but share a common structure: the post-Editor (and, when relevant, post-Suggester) output is evaluated against a ground truth or quality model, pre/post improvement is measured, and where applicable the incremental effect of the Editor stage is isolated.
| Application | Metric(s) | Suggester Alone | Suggester–Editor | Gain |
|---|---|---|---|---|
| LinguistAgent | F₁ (metaphor ID) | 0.5070 | 0.5753 | +13.4% (relative) |
| Job Referral (RAG) | Success rate (weak requests) | 0.392 | 0.447 | +14% (relative) |
| SGAgent (SWE-Bench) | Repair resolved % | 38.0% | 51.3% | +13.3 pp |
| SWE-Edit (SWE-bench) | Patch resolved % | 69.9% | 72.0% | +2.1 pp |
| Business Advice | Composite rubric | varies | – | Lower output variance; quality parity with LLM baselines |
Token-level scoring, strict classification labels, calibrated reward models, and downstream regression/test validation are employed. In ablation experiments, the insertion of an Editor (Reviewer/Explainer) step consistently yields significant gains, especially for lower-quality or more ambiguous initial proposals (Li, 5 Feb 2026, Chu et al., 28 Dec 2025, Zhang et al., 27 Feb 2026). Cost and efficiency metrics—such as inference cost per issue and patch application error rate—are explicitly tracked in SWE-Edit, evidencing greater edit reliability and resource economy (Zhang et al., 28 Apr 2026).
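Note that the reported gains mix two kinds of quantity: relative improvement over the Suggester-alone baseline (LinguistAgent, job referral) and absolute percentage-point gains (SGAgent, SWE-Edit). A brief sketch makes the distinction explicit:

```python
def relative_gain(before: float, after: float) -> float:
    """Relative improvement over the baseline value."""
    return (after - before) / before

def pp_gain(before: float, after: float) -> float:
    """Absolute gain in percentage points (inputs already in percent)."""
    return after - before

# LinguistAgent F1 for metaphor identification: 0.5070 -> 0.5753
print(f"{relative_gain(0.5070, 0.5753):.1%} relative")
# SGAgent repair resolution on SWE-Bench: 38.0% -> 51.3%
print(f"{pp_gain(38.0, 51.3):.1f} pp absolute")
```

Reporting both forms avoids the common ambiguity where "+13%" could mean either a ratio or a difference of percentages.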
5. Implementation Practices and Adaptability
Best practices for Suggester–Editor system engineering comprise:
- Strict separation of roles: the Suggester refrains from “over-correcting,” ceding all downstream changes to the Editor, which enforces high fidelity to workflow constraints (Li, 5 Feb 2026).
- Strongly typed/JSON schema output: Segregation of generated versus decision/evaluation fields enables transparent scoring and error traceability (Hou et al., 2 Dec 2025, Li, 5 Feb 2026, Chu et al., 28 Dec 2025).
- Composite or modular workflows: Pipelines can embed auxiliary agents (e.g., retrievers, explainers, ranking LLMs) to further partition reasoning, evidence retrieval, and feedback propagation (Chu et al., 28 Dec 2025, Bhandari et al., 17 Jan 2026).
- Iterative prompt hyperparameter tuning: Feedback from Editor outputs and live evaluation can inform suggestion prompt design and sampling strategies (Li, 5 Feb 2026).
- Extension to new domains: By adapting structured output tags or payloads (e.g., from <Metaphor> spans to <Claim> or <PER>), the same agentic loop is ported to sequence labeling, question answering, or action recommendation (Li, 5 Feb 2026, Bhandari et al., 17 Jan 2026).
- Evaluation checklists and schema-aware logging: Versioning, robust conflict detection (e.g., patch offsets in PaperDebugger), and root-cause diagnostics for API or format failures are essential for reliable deployment (Hou et al., 2 Dec 2025).
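Hash-based versioning with offset conflict detection, loosely modeled on the patch handling described for PaperDebugger, can be sketched as follows (the patch fields and function names here are assumptions for illustration, not PaperDebugger's actual API):

```python
import hashlib

def doc_hash(text: str) -> str:
    """Fingerprint the current document version."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def apply_patch(document: str, patch: dict) -> str:
    """Editor-side patch application with two conflict checks:
    a stale base version (hash mismatch) and a drifted offset."""
    if patch["base_hash"] != doc_hash(document):
        raise ValueError("stale patch: document changed since suggestion")
    start = patch["offset"]
    end = start + len(patch["old"])
    if document[start:end] != patch["old"]:
        raise ValueError("offset conflict: text at patch site does not match")
    return document[:start] + patch["new"] + document[end:]

doc = "We utilise a modular pipeline."
patch = {"base_hash": doc_hash(doc), "offset": 3, "old": "utilise", "new": "use"}
print(apply_patch(doc, patch))  # "We use a modular pipeline."
```

Rejecting rather than silently re-anchoring a stale patch is the conservative choice: it surfaces the conflict for re-suggestion instead of risking a corrupting edit.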
Domain-specific limitations are also acknowledged: external reward models may introduce proxy gaps, and system norms may drift, degrading formerly reliable exemplars or success predictors (Chu et al., 28 Dec 2025).
6. Generalization and Theoretical Implications
The Suggester–Editor (or suggestion–edit) paradigm is shown to align with classical separation-of-concerns principles in software engineering and to embody aspects of reflective human-in-the-loop review. In code editing domains, the Viewer–Editor specialization in SWE-Edit is a direct analog, enabling scalable and context-efficient reasoning by the main agent and delegating brittle, error-prone formatting operations to a compact, trainable editor subagent (Zhang et al., 28 Apr 2026).
A plausible implication is that as LLM-based agentic systems proliferate, the Suggester–Editor decomposition will form a reusable backbone for highly interleaved, multi-modal reasoning tasks wherever both creative proposal and rigorous curation are required. The modularity allows for the replacement or compositional extension of either role, independent advances in agent models, and systematic cost–quality optimization.
7. Summary Table of Suggester–Editor Systems (Selected Papers)
| System | Domain | Suggester Role | Editor Role | Empirical Gain |
|---|---|---|---|---|
| LinguistAgent | Linguistics | Token/Span tagging, reasoning | Critique, revise, score (F₁) | +13.4% (Li, 5 Feb 2026) |
| PaperDebugger | Writing | Generate patch, rationale | Apply patch, version control | N/A |
| SGAgent | Code Repair | Context+plan synthesis | Patch synthesis, validation | +13.3pp resolved (Zhang et al., 27 Feb 2026) |
| SWE-Edit | SWE | Plan edit (main agent) | Render, apply diff | +2.1pp resolved (Zhang et al., 28 Apr 2026) |
| Chu & Huang (2025) | Writing | Rewrite request (LLM) | Predict success (reward model) | +14% on weak reqs (Chu et al., 28 Dec 2025) |
| Business Advice | Informatics | Draft, refine recommendations | Rubric-based critique/refine | Output consistency (Bhandari et al., 17 Jan 2026) |
The Suggester–Editor agent design is a general, empirically validated architectural template for decomposing complex reasoning and editing workflows in LLM-driven systems, adaptable to a wide class of sequential reasoning, annotation, and generation tasks across disciplines.