SOLOMON Multi-LLM-Agent Framework
- SOLOMON is a neuro-inspired multi-agent system that orchestrates parallel LLMs for adaptable, multi-step reasoning in complex tasks.
- It employs explicit belief state representations and reflective assessment to mitigate planning errors and hallucinations.
- The framework enables decentralized task execution and emergent Theory of Mind, optimizing performance in diverse applications.
The SOLOMON Multi-LLM-Agent Reasoning Framework is a neuro-inspired architecture designed to enhance the adaptability and reasoning capacity of LLMs by orchestrating collaborative multi-agent interactions. Initially introduced for domain-specific applications such as semiconductor layout design (Wen et al., 5 Feb 2025), subsequent research has established its broader significance in enabling complex multi-step reasoning, Theory of Mind (ToM) inference, decentralized task execution, and scalable performance through explicit agent coordination, memory augmentation, and reflective evaluation (Li et al., 2023, Michelman et al., 7 Mar 2025). The framework integrates diverse reasoning paths, structured feedback mechanisms, and explicit state representations to address challenges in long-horizon planning, hallucination, and effective inter-agent collaboration.
1. Architectural Foundations
The SOLOMON framework is built upon a layered neuro-inspired reasoning network (Wen et al., 5 Feb 2025) comprising three principal components:
- Thought Generators: A pool of heterogeneous LLMs operates in parallel, each producing candidate reasoning paths or "thoughts" for a target task. This design draws on diverse language-model biases and leverages retrieval-augmented generation (RAG) to supplement domain-specific information.
- Thought Assessor: A dedicated evaluation module ("judge") receives aggregated outputs from the thought generators, error logs, and code or task-specific feedback. Using in-context learning and reflection (such as self-reflection strategies), the assessor identifies inconsistencies, corrects errors, and reduces hallucinations. Its decisions are guided by a goal-oriented objective based on the Free Energy Principle,
$$F = \mathbb{E}_{q(s)}\left[\ln q(s) - \ln p(s, o)\right],$$
with $q(s)$ as the agent’s belief over possible states and $p(s, o)$ the joint distribution over states and observations.
- Steering Subsystem: This human-in-the-loop module employs prompt engineering to adjust instruction and guidance for the generator and assessor components, facilitating rapid adaptation to new domains and changing objectives without retraining the underlying models.
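The interplay of the three components above can be pictured as a simple orchestration loop. The sketch below is an illustrative assumption, not the paper's API: the callables stand in for actual LLM calls, and the function names (`solomon_step`, `steer`, etc.) are invented for exposition.

```python
from typing import Callable, List

# Illustrative SOLOMON loop; the callables stand in for real LLM calls,
# which are not specified at this level of detail.
def solomon_step(
    task: str,
    generators: List[Callable[[str], str]],      # Thought Generators (parallel LLMs)
    assessor: Callable[[str, List[str]], str],   # Thought Assessor ("judge")
    steer: Callable[[str], str],                 # Steering Subsystem (prompt engineering)
) -> str:
    prompt = steer(task)                          # human-in-the-loop guidance
    thoughts = [g(prompt) for g in generators]    # diverse candidate reasoning paths
    return assessor(prompt, thoughts)             # reflective selection/correction

# Toy usage with stub "models":
gens = [lambda p: p.upper(), lambda p: p[::-1]]
judge = lambda p, ts: max(ts, key=len)            # pick the longest candidate
out = solomon_step("route the wires", gens, judge, lambda t: "Task: " + t)
```

The key design point is that the steering subsystem touches only prompts, so adapting to a new domain changes `steer` rather than the underlying models.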
Parallel work (Li et al., 2023) demonstrates that SOLOMON-style systems, operating in fully decentralized, zero-shot settings, exhibit emergent collaborative behavior, spontaneous leadership role assumption, information sharing, and resolution of coordination problems in multi-agent cooperative tasks.
2. Emergent Social Reasoning and Theory of Mind
SOLOMON agents demonstrate high-order Theory of Mind (ToM) capacities by inferring both their own beliefs and those of teammates, even reasoning recursively about others’ beliefs concerning their own knowledge (Li et al., 2023). The evaluation distinguishes:
- Introspection: Agents articulate their own mental state and knowledge (e.g., room or object states).
- First-order ToM: Agents infer what another agent knows, based on position or shared history.
- Second-order ToM: Agents deduce whether another agent is aware of the first agent’s knowledge, based on communication history.
Experiments show that, with explicit belief state representations, agents reach nearly 70% accuracy in second-order ToM inference, confirming that SOLOMON’s architecture can exploit both direct observation and subtle team-interaction cues.
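One minimal way to picture the three levels is a nested belief store, where each additional level of nesting corresponds to one more order of ToM. The structure and agent names below are illustrative assumptions, not the benchmark's actual representation.

```python
# Illustrative nested belief store for the three ToM levels (an assumption,
# not the evaluation's actual representation). Deeper nesting = higher order.
beliefs = {
    "alice": {                      # Alice's own mental state (introspection)
        "room3": "unexplored",
        "bob": {                    # Alice's model of Bob (first-order ToM)
            "room3": "on_fire",
            "alice": {              # Alice's model of Bob's model of her (second-order ToM)
                "room3": "unknown",
            },
        },
    },
}

def introspect(agent: str, fact: str):
    return beliefs[agent][fact]

def first_order(agent: str, other: str, fact: str):
    return beliefs[agent][other][fact]

def second_order(agent: str, other: str, fact: str):
    # Does `other` know what `agent` knows about `fact`?
    return beliefs[agent][other][agent][fact]
```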
3. Planning Optimization and Belief State Representations
Despite their strengths, LLM-based agents suffer from systematic planning failures, most notably:
- Long-horizon context problems: Agents may lose track of relevant facts as context length increases, leading to invalid or low-quality actions.
- Hallucination: Actions may be generated based on false or incomplete beliefs about the task state.
To mitigate these issues, SOLOMON introduces structured, text-based belief state representations: each agent maintains a concise summary of task-critical facts (e.g., its current location, held objects, and completed subgoals) that is re-injected into every prompt.
This formulation ensures the persistence of critical facts and significantly reduces both reasoning errors and hallucination, resulting in decreased task completion rounds and improved valid action rates.
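A text-based belief state of this kind might be maintained as a small record that serializes into the prompt; the field names below are illustrative assumptions rather than the paper's schema.

```python
from dataclasses import dataclass, field

# Illustrative belief-state record; field names are assumptions, not the
# paper's schema. The point is that critical facts persist as compact text
# re-injected into every prompt, instead of drifting out of a long context.
@dataclass
class BeliefState:
    location: str = "unknown"
    inventory: list = field(default_factory=list)
    subgoals_done: list = field(default_factory=list)

    def to_prompt(self) -> str:
        return (
            f"Current room: {self.location}. "
            f"Holding: {', '.join(self.inventory) or 'nothing'}. "
            f"Completed: {', '.join(self.subgoals_done) or 'none'}."
        )

state = BeliefState(location="Room 2", inventory=["extinguisher"])
state.subgoals_done.append("fire in Room 1 put out")
summary = state.to_prompt()
```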
4. Collaborative Learning, Memory, and Aggregation Strategies
SOLOMON employs multi-agent collaboration coupled with dynamic memory structures for robust learning and reasoning (Michelman et al., 7 Mar 2025). Core mechanisms include:
- Self-consistency: Multiple identical agents (with shared context/exemplar sets) produce parallel outputs, aggregated using majority voting.
- Varied-context agents: Each agent samples distinct exemplars from a shared or learned memory bank, increasing diversity in reasoning paths and often outperforming single-agent or strictly self-consistent ensembles.
- Summarizer agent: Instead of voting, a summarizer reasons over the collective outputs and chains-of-thought, producing a synthesized response; this is particularly valuable when base agents are weak.
Memory banks (frozen or continuously learned) provide in-context exemplars. Random retrieval methods often yield higher accuracy than similarity-based retrieval, as they expose the system to more diverse reasoning styles and avoid the trap of redundant, narrowly focused exemplars.
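The three aggregation strategies and random exemplar retrieval can be sketched in a few lines; the agent outputs and the memory bank below are stubs standing in for LLM calls and learned exemplars.

```python
import random
from collections import Counter

# Illustrative aggregation strategies; agent outputs are stubs for LLM calls.
def self_consistency(answers):
    """Majority vote over parallel outputs from identical agents."""
    return Counter(answers).most_common(1)[0][0]

def sample_exemplars(memory_bank, k, rng):
    """Random retrieval: diverse exemplars often beat similarity search here."""
    return rng.sample(memory_bank, k)

def summarize(answers, summarizer):
    """Summarizer agent: reason over all outputs instead of voting."""
    return summarizer(answers)

rng = random.Random(0)
bank = ["exemplar-1", "exemplar-2", "exemplar-3", "exemplar-4"]
ctx = sample_exemplars(bank, 2, rng)        # varied-context agents get distinct draws
vote = self_consistency(["B", "A", "B"])    # self-consistency over parallel answers
```

Varied-context agents would each call `sample_exemplars` with a different seed or draw, which is what produces the diversity of reasoning paths described above.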
5. Multi-Agent Coordination and Communication Protocols
The framework supports fully decentralized interaction as well as structured communication protocols. Agents can voluntarily assume roles such as leader or critic, share mission-critical facts, and broadcast or request help (Li et al., 2023). Key findings include:
- Zero-shot emergent coordination: Agents learn team behavior and leadership roles without explicit training or central critics.
- Robust inter-agent information flow: With explicit belief states and message routing, agents resolve ambiguity and maintain global awareness despite partial observability.
- Token and efficiency trade-offs: Scaling up the number of agents or the diversity of input contexts improves reasoning, but with increased computational and token costs (Xu et al., 12 May 2025).
Optimal communication protocols may involve sequential information passing, where each agent receives the full reasoning trace of its immediate predecessor and only the summarized outputs from previous rounds, avoiding information overload and redundancy.
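The sequential protocol just described can be sketched as follows; the stub agents and the `summarize` truncation are illustrative assumptions, and a real deployment would use LLM calls and an LLM-generated summary in their place.

```python
# Illustrative sequential information-passing protocol: each agent receives
# the full trace of its immediate predecessor plus only summaries of earlier
# rounds, bounding context growth. Agents here are stub functions.
def run_chain(task, agents, summarize):
    summaries, prev_trace = [], ""
    for agent in agents:
        context = "\n".join(summaries + [prev_trace, task])
        prev_trace = agent(context)              # full reasoning trace
        summaries.append(summarize(prev_trace))  # compressed for later rounds
    return prev_trace

# Stub agents that report the size of the context they saw; the summarizer
# keeps only the first 10 characters of each trace.
agents = [lambda c: f"A1 says: {len(c)}", lambda c: f"A2 says: {len(c)}"]
final = run_chain("solve X", agents, lambda t: t[:10])
```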
6. Practical Applications, Performance, and Future Directions
SOLOMON’s architecture is validated across complex, domain-specialized applications such as semiconductor layout design (Wen et al., 5 Feb 2025), multi-agent cooperative text games (Li et al., 2023), and formal logic benchmarks (Michelman et al., 7 Mar 2025). Demonstrated performance gains include:
- Reduced runtime and scaling errors compared to standalone LLMs.
- Near state-of-the-art results in competitive reasoning benchmarks.
- Enhanced generalization and task robustness via modular role and memory assignment.
Future research directions include:
- Hierarchical multi-level SOLOMON configurations to support more advanced, multi-stage reasoning.
- Integration of enhanced multimodal linking for combined code, text, and visual input evaluation.
- Iterative, feedback-driven self-improvement mechanisms.
- Expansion to additional domains such as financial modeling and power grid optimization.
7. Comparative Assessment to Related Frameworks
SOLOMON differs from contemporary frameworks as follows:
| Framework | Role Assignment | Intervention Point | Memory/State Design |
|---|---|---|---|
| SOLOMON | Thought generator/assessor, flexible role-taking | Prompt and output aggregation; belief state management | Explicit, structured belief state; dynamic memory bank |
| RR-MP (He et al., 31 Dec 2024) | Reactive and reflection agents (fixed dual roles) | Pathwise post-hoc summarization | Path-level dialogue memory |
| ReMA (Wan et al., 12 Mar 2025) | Hierarchical meta-thinking/reasoning agents | Multi-agent RL, strategic oversight | Joint RL-driven meta-thought memory |
| MASTER (Gan et al., 24 Jan 2025) | Tree-structured agent graphs via MCTS | LLM-based validation/assessment | Inherited context with dynamic expansion |
| SynergyMAS (Kostka et al., 2 Jul 2025) | Specialized roles, ToM-enriched collaboration | Graph+logic + RAG feedback | Neo4j+Clingo+vector memory, explicit “My Beliefs” |
SOLOMON’s emphasis on explicit belief states, multi-agent self-reflection, and in-context memory distinguishes it from frameworks oriented strictly toward debate, reward optimization, or rigid role division.
SOLOMON embodies a flexible, modular approach to multi-agent LLM reasoning, leveraging neuro-inspired design, structured inter-agent memory, reflective assessment, and dynamic role allocation to produce robust, adaptive solutions in complex and ambiguous problem domains. The integration of Theory of Mind elements, explicit belief state tracking, and scalable communication protocols positions SOLOMON as a foundational framework for practical, collaborative AI reasoning systems in both research and application domains.