DiMo: Multi-Agent Framework for Diverse Thinking
- DiMo is a multi-agent framework that assigns specialized LLM agents to distinct reasoning roles, enabling collaborative and auditable problem-solving.
- It employs iterative debate with role-specific feedback and evidence integration to refine outputs and enhance overall solution accuracy.
- Empirical evaluations demonstrate that DiMo significantly improves accuracy on benchmarks such as GSM8K by effectively balancing divergent creativity with logical precision.
A Multi-Agent Collaboration Framework for Diverse Thinking Modes (DiMo) is an architectural and algorithmic paradigm in which multiple specialized LLM agents interact according to structured protocols, each enacting a distinct cognitive or reasoning style. DiMo is designed to emulate, augment, and scrutinize diversified human‑like problem-solving by orchestrating iterative debate and collaboration among agents with complementary expertise. Through role specification, multi-phase feedback, evidence synthesis, and structured justification, DiMo improves accuracy, interpretability, and robustness over single-agent or non-structured debate baselines. The following sections delineate its architectural structure, agent specializations, debate mechanisms, empirical performance, technical semantics, and applications, based solely on findings in (He et al., 18 Oct 2025).
1. Architectural Structure and Role Specialization
DiMo operationalizes multi-agent collaboration by explicitly assigning four primary LLM agents with distinct, fixed reasoning paradigms:
- Generator: Produces the initial answer, typically with a detailed, step-wise rationale (critical for math benchmarks).
- Evaluator: Assesses the initial response for logical errors, computational or factual mistakes, and general coherence.
- Knowledge Supporter: Retrieves domain-specific supporting evidence, validates facts, and, in Web-native deployments, attaches URL-annotated passages. Ensures the factual basis of the solution (prominent in Divergent Thinking Mode).
- Reasoning Path Provider: Constructs formalized, explicit reasoning chains or logical derivations, supporting or refining previous answers.

Additional roles, such as a Refiner (for targeted error correction) and a Judger (for overall consistency), may be instantiated, especially for logical/mathematical reasoning tasks.
DiMo can operate in Divergent and Logical thinking modes. Divergent mode is suited for tasks emphasizing insight, creativity, or wide‑ranging contextual knowledge, driving parallel hypotheses and knowledge integration. Logical mode is tailored to math and formal problem-solving, enforcing stepwise verification and consistency.
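The fixed role assignments and their mode memberships can be sketched as a small registry. This is an illustrative sketch only; the class names, prompts, and the `roles_for` helper are assumptions, not the paper's implementation:

```python
from dataclasses import dataclass
from enum import Enum

class Mode(Enum):
    DIVERGENT = "divergent"
    LOGICAL = "logical"

@dataclass(frozen=True)
class Role:
    name: str
    prompt: str   # fixed system prompt enforcing the reasoning paradigm
    modes: tuple  # thinking modes in which this role participates

# Role list follows the specializations described in the text.
ROLES = [
    Role("Generator", "Produce an initial step-wise answer.", (Mode.DIVERGENT, Mode.LOGICAL)),
    Role("Evaluator", "Check the answer for logical and factual errors.", (Mode.DIVERGENT, Mode.LOGICAL)),
    Role("KnowledgeSupporter", "Retrieve URL-annotated supporting evidence.", (Mode.DIVERGENT,)),
    Role("ReasoningPathProvider", "Construct an explicit reasoning chain.", (Mode.DIVERGENT,)),
    Role("Refiner", "Correct the specific erroneous steps.", (Mode.LOGICAL,)),
    Role("Judger", "Accept or reject the refinement for coherence.", (Mode.LOGICAL,)),
]

def roles_for(mode: Mode) -> list:
    """Return the names of the agents active in a given thinking mode."""
    return [r.name for r in ROLES if mode in r.modes]
```

Keeping roles as immutable data rather than code makes the mode-to-agent routing auditable and easy to extend with further roles.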
2. Iterative Debate and Feedback Mechanism
The defining process in DiMo is a cyclic, multi‑stage debate:
- Generation: The Generator produces an initial answer $a_0$ to the input question $q$.
- Evaluation: The Evaluator reviews the current answer $a_t$ and identifies errors or inconsistencies $e_t$.
- Divergent Mode: If errors are flagged, the Knowledge Supporter and Reasoning Path Provider contribute annotated evidence $k_t$ and formal reasoning paths $r_t$, respectively.
- These contributions are integrated to update the working solution, i.e., $a_{t+1} = \mathrm{Update}(a_t, k_t, r_t)$.
- Logical Mode: Upon error detection, the Refiner corrects specific steps; the Judger checks holistic logical coherence, retaining or rejecting modifications based on correctness.
- The loop is repeated for a fixed number of rounds (empirically optimal at three for hard problems), enhancing both solution quality and explicitness of the reasoning chain.
Each agent communicates structured, semantically typed outputs, and the framework maintains all intermediate states through each iteration, culminating in a fully annotated, auditable chain of reasoning.
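The divergent-mode debate loop described above can be sketched as follows. The agent callables, the `has_error` flag, and the `integrate` merger are assumptions standing in for LLM calls; the trace list models the retained intermediate states:

```python
def dimo_debate(question, generator, evaluator, supporter, path_provider,
                integrate, rounds=3):
    """Divergent-mode DiMo loop (illustrative sketch).

    Each agent argument is a callable wrapping an LLM call; `integrate`
    merges evidence and a reasoning path into a revised answer. Returns
    the final answer plus the full auditable trace of contributions.
    """
    answer = generator(question)                      # initial answer a_0
    trace = [("Generator", answer)]
    for t in range(rounds):                           # up to T debate rounds
        critique = evaluator(question, answer)        # e_t: flagged issues
        if not critique["has_error"]:
            break                                     # nothing left to fix
        evidence = supporter(question, critique)      # k_t: annotated evidence
        path = path_provider(question, critique)      # r_t: formal reasoning path
        answer = integrate(answer, evidence, path)    # a_{t+1}
        trace += [("Evaluator", critique), ("KnowledgeSupporter", evidence),
                  ("ReasoningPathProvider", path), ("Update", answer)]
    return answer, trace
```

With stub agents the loop terminates as soon as the Evaluator stops flagging errors, so the three-round cap is an upper bound rather than a fixed cost.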
3. Semantics-Aware, Web-Native Evidence Integration
A salient feature of DiMo is its semantics-aware and Web-native architecture:
- Evidence Chains: Every contribution is semantically tagged (e.g., "Fact," "Supporting Reason," "Corollary") and, in Web-native instances, URL-annotated for downstream validation.
- Retrieval-Augmented Reasoning: The Knowledge Supporter issues queries to corpora or knowledge graphs, returning passages or facts with provenance, allowing integration of up-to-date, verifiable evidence.
- This evidence is incorporated into the reasoning chain at each round, supporting both transparency and external auditability.
Such explicit typing and external validation allow for downstream systems and human users to inspect, challenge, or re-use individual components of the multi-agent reasoning process.
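A minimal sketch of such a semantically tagged, URL-annotated evidence record follows; the field names and the tag vocabulary (taken from the examples in the text) are assumptions:

```python
from dataclasses import dataclass
from typing import Optional

# Semantic tags follow the examples given in the text.
ALLOWED_TAGS = {"Fact", "Supporting Reason", "Corollary"}

@dataclass
class Evidence:
    tag: str                         # semantic type of the contribution
    text: str                        # the passage or derived statement
    source_url: Optional[str] = None # provenance in Web-native deployments
    round_no: int = 0                # debate round in which it was added

    def __post_init__(self):
        if self.tag not in ALLOWED_TAGS:
            raise ValueError(f"unknown semantic tag: {self.tag}")

def audit_chain(chain):
    """Expose (tag, provenance) pairs so a downstream checker or human
    reviewer can re-validate each item in the evidence chain."""
    return [(e.tag, e.source_url) for e in chain]
```

Rejecting unknown tags at construction time keeps the chain machine-checkable, which is what makes the downstream inspection and challenge of individual components feasible.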
4. Empirical Performance and Interpretability
DiMo demonstrates consistent improvements in both accuracy and interpretability across a variety of benchmarks:
- On mathematics benchmarks, DiMo lifts LLaMA-3-8B accuracy from roughly 50% (baseline) to over 90% on GSM8K, and from below 50% to above 70% on GSM-hard.
- Gains are especially pronounced where complex, multi-step logical consistency and explicit error correction are required.
- Output is rendered as a multi-stage justification graph—each node corresponding to a particular agent’s contribution at each debate round—thus making the process fully transparent to audit and intervention.
5. Mathematical and Technical Formalization
DiMo’s agent communications and iterative computation are made explicit in mathematical notation:
Writing $q$ for the input question, $a_t$ for the working answer after round $t$, and $\mathrm{Gen}$, $\mathrm{Eval}$, $\mathrm{KS}$, $\mathrm{RP}$, $\mathrm{Refine}$, $\mathrm{Judge}$ for the respective agents:
- Generation: $a_0 = \mathrm{Gen}(q)$
- Divergent Mode Update: $e_t = \mathrm{Eval}(a_t)$; $k_t = \mathrm{KS}(q, e_t)$, $r_t = \mathrm{RP}(q, e_t)$; $a_{t+1} = \mathrm{Update}(a_t, k_t, r_t)$
- Logical Mode Process:
- $e_t = \mathrm{Eval}(a_t)$,
- If $e_t$ flags an error, $a_t' = \mathrm{Refine}(a_t, e_t)$ and $a_{t+1} = \mathrm{Judge}(a_t, a_t')$, else $a_{t+1} = a_t$,
- Iterative update: repeated for $t = 0, 1, \dots, T-1$,
- where $T$ denotes the number of debate/refinement cycles.
These explicit formulas clarify agent tasking, information routing, and refinement, ensuring each reasoning mode operates as a well‑constrained process.
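The logical-mode process formalized in this section can be rendered as a short recursion. The function names mirror the agent roles but are illustrative assumptions; each callable stands in for an LLM call:

```python
def logical_mode_step(answer, evaluate, refine, judge):
    """One logical-mode round: evaluate the current answer; on error,
    the Refiner proposes a correction and the Judger keeps it only if
    it is judged coherent, otherwise the previous answer is retained."""
    critique = evaluate(answer)
    if not critique["has_error"]:
        return answer
    candidate = refine(answer, critique)
    return candidate if judge(candidate) else answer

def run_logical_mode(question, generate, evaluate, refine, judge, rounds=3):
    """Run the logical-mode recursion for a fixed number of cycles
    (three is cited as empirically optimal for hard problems)."""
    answer = generate(question)          # initial answer
    for _ in range(rounds):              # T refinement cycles
        answer = logical_mode_step(answer, evaluate, refine, judge)
    return answer
```

The Judger's accept/reject step is what makes refinement monotone in practice: a rejected correction leaves the prior answer intact rather than propagating a bad edit into later rounds.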
6. Use Cases and Broader Applicability
The DiMo framework’s structured, auditable multi-agent collaboration is directly applicable across domains where accuracy, transparency, and diverse reasoning strategies are mission-critical:
- Educational Technology: Step-by-step solution justifications and error correction in mathematics tutoring.
- Commonsense and Knowledge Reasoning: Multi-source fact integration and debate for general question-answering.
- Scientific and Legal Domains: Explicit, auditable reasoning chains needed for regulatory compliance, peer review, and critical decision-making.
- Web-Integrated Applications: Retrieval-augmented answers with semantically tagged evidence for digital assistants or automated researchers.
7. Future Directions
Proposed research directions for DiMo include:
- Scaling to Harder Datasets: Extending to the MATH and GPQA benchmarks and integrating symbolic and alternative cognitive reasoning paradigms.
- Task-Adaptive Agent Routing: Dynamically selecting and routing agent roles based on task characteristics.
- Enhanced Knowledge Integration: Further grounding evidence chains via deep integration with web corpora and structured knowledge bases.
- Efficiency and Cost Analysis: Analyzing trade-offs in token and compute cost versus accuracy and interpretability under constrained deployment.
The DiMo framework, by leveraging multi-agent debate with semantically explicit, role-differentiated agents and iterative evidence-backed refinement, establishes a reproducible, extensible method for operationalizing diverse reasoning modes in LLM-based intelligent systems (He et al., 18 Oct 2025).