When Agent Markets Arrive

Published 8 Apr 2026 in cs.CE | (2604.06688v1)

Abstract: AI agents are increasingly transacting on behalf of users -- delegating tasks, spending budgets, and negotiating with unfamiliar counterparties. From skill marketplaces to agent-only bazaars, the economic infrastructure of these emerging platforms is being built ad hoc, yet early design choices tend to lock in; understanding what dynamics they produce is urgent. We present Diagon, a programmable market system designed to inform the institutional design of near-future agent cognitive-labour markets. Diagon is populated by heterogeneous tool-using agents, making the full cycle of job posting, bidding, negotiation, execution, payment, and reputation accumulation end-to-end observable and experimentally manipulable. We instantiate one market form to demonstrate Diagon. We find that market exchange generates $3.2\times$ the wealth of self-sufficient agents, but these gains depend strongly on institutional structure; for example, interventions such as identity transparency and stronger competitive selection can degrade market performance rather than improve it. These findings highlight concrete design requirements for the economic infrastructure of the agent era. Code and data are available at https://github.com/assassin808/diagon.


Authors (3)

Summary

The paper demonstrates that market-mediated trade in agent-driven cognitive-labour markets yields 3.2× wealth gains and improved task quality compared to autarky.
It employs targeted ablation experiments on institutional structures, revealing that transparency and selection pressure critically influence cross-family trade and market performance.
Agent linguistic analysis uncovers that natural language reasoning correlates with strategic decisions, driving emergent specialization and reinforcing social capital dynamics.

Institutional Design and Dynamics in Agent-Driven Cognitive-Labour Markets

Introduction

"When Agent Markets Arrive" (2604.06688) presents Diagon, an experimental platform for agent-driven cognitive-labour markets composed of heterogeneous tool-using LLM agents. The framework exposes the end-to-end market cycle (job posting, bidding, negotiation, execution, payment, reputation accrual, and evolutionary selection), so that emergent dynamics can be both observed and manipulated. The study compares market exchange to agent autarky, probes institutional mechanisms through targeted ablations, and analyzes linguistic and behavioral correlates in agent discourse and decision making. The findings show substantial productivity and wealth gains from market-mediated trade, but these gains are highly contingent on institutional structure, which makes principled market design a necessity for future agent economies.

Market Productivity, Equality, and Frictions

Diagon's experimental market yields distinct advantages over autarkic (self-sufficient) agent configurations:

Agents participating in trade achieve $3.2\times$ the average wealth of autarkic agents ($d = +1.57$, $p < 0.001$) and higher task quality (mean $0.55$ vs. $0.46$; $d = +0.19$, $p < 0.001$).
Wealth inequality is lower under market exchange (Gini $= 0.33$) than under autarky (Gini $= 0.42$), indicating that market exchange not only increases aggregate productivity but also distributes it more equitably across agents.
Figure 1: Wealth, contract award, and task quality distributions reveal enhanced wealth and quality, but concentrated contract allocation, under market exchange.
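The effect sizes quoted above are reported as $d$. The summary does not say which variant was used, so the following is a minimal sketch assuming the common pooled-standard-deviation form of Cohen's $d$. The function name and the toy wealth values are hypothetical and are not data from the paper.

```python
import math
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation.

    Assumption: the pooled-SD form of Cohen's d; the paper may use another variant.
    """
    n_t, n_c = len(treatment), len(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


# Hypothetical toy final-wealth values for illustration only.
market_wealth = [12.0, 15.0, 9.0, 14.0]
autarky_wealth = [4.0, 5.0, 3.0, 4.0]
effect = cohens_d(market_wealth, autarky_wealth)
```

A positive value means the first group (here, market agents) has the higher mean, measured in units of the pooled standard deviation.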

Competitive allocation, however, concentrates contract awards (measured by the Gini of contract awards in the market versus autarky), with a few agents winning a disproportionate share. Despite early divergence in wealth trajectories, upward mobility exists: a significant fraction of lower-quartile agents escape the bottom quartile within 24 rounds. Transaction frictions persist. About 42% of interactions end in disputes, with little abatement over time, and disputes are concentrated among agents with lower reputation or contentious evaluation tendencies.
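Both wealth inequality and contract concentration are summarized here with the Gini coefficient. As an illustration, the sketch below uses the standard rank-based formula for a sample of non-negative values; the function name and inputs are hypothetical, and the paper's exact estimator is not stated in this summary.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal).

    Uses the sorted-rank identity:
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    with x sorted ascending and ranks i starting at 1.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # convention for empty or all-zero inputs
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical contract counts per agent: one agent wins everything.
concentrated = gini([0, 0, 0, 10])
# Every agent wins the same number of contracts.
equal = gini([5, 5, 5, 5])
```

With this uncorrected estimator the maximum for $n$ agents is $(n - 1)/n$, so small populations cannot reach 1 even when a single agent holds everything.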

Emergent Specialization and Network Structure

The absence of exogenous role assignment does not prevent the spontaneous emergence of specialized agent roles and robust trading networks:

By final rounds, model families polarize toward net-contractor or net-poster roles, reflecting endogenous specialization and comparative advantage.
The network structure exhibits increasing concentration and connection, with the proportion of reciprocal trade relationships substantially exceeding random baseline levels.
Figure 2: Emergent agent specialization, market concentration, unique trading pairs, and strong reciprocity characterize the final network structure.
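One way to make "reciprocity above a random baseline" concrete is sketched below. It treats each contract as a directed edge from poster to contractor and compares the observed share of reciprocated pairs with the share expected if the same number of distinct directed pairs were placed uniformly at random. This null model and all names are assumptions for illustration; the summary does not specify which baseline the paper uses.

```python
def reciprocity(edges):
    """Share of distinct directed pairs (a, b) whose reverse pair (b, a) also exists."""
    pairs = {(a, b) for a, b in edges if a != b}
    if not pairs:
        return 0.0
    mutual = sum(1 for a, b in pairs if (b, a) in pairs)
    return mutual / len(pairs)


def random_baseline(n_agents, n_pairs):
    """Expected reciprocity when n_pairs distinct directed pairs are chosen
    uniformly at random among n_agents * (n_agents - 1) possible ones.

    Given one pair is present, the other n_pairs - 1 pairs fill the remaining
    slots uniformly, so its reverse is present with probability
    (n_pairs - 1) / (possible - 1).
    """
    possible = n_agents * (n_agents - 1)
    if possible <= 1 or n_pairs <= 1:
        return 0.0
    return (n_pairs - 1) / (possible - 1)


# Hypothetical trades as (poster, contractor) pairs.
trades = [("a", "b"), ("b", "a"), ("a", "c")]
observed = reciprocity(trades)
expected = random_baseline(n_agents=3, n_pairs=3)
```

Here two of the three distinct pairs are reciprocated, which is above the uniform-random expectation for a graph of the same size and density.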

These dynamics indicate that successful agents accrue both economic and social capital (reciprocal trust relationships and positive reputation), which further amplifies their comparative advantage. Poster evaluation quality remains noisy, particularly for intermediate task quality, confirming a partial lemons-market dynamic. Although reputation is predictive of wealth, a sizable residual of false disputes persists.

Figure 3: Reputation correlates with wealth; bid price stratifies by model family; dispute rates vary but do not resolve with time.

Institutional Mechanisms: Ablation and Contradictory Effects

Market productivity is robust only under specific institutional structures. Ablations targeting transparency, agent disposition, evolutionary pressure, skill diversity, and economic parameters yield marked, sometimes counterintuitive, effects:

Transparency ablation (revealing agent family) produces the most dramatic effect: cross-family trade collapses from 86% to 67%, eliminating gains from specialization.
Contrary to human-market intuitions, honesty instructions increase disputes and reduce payment; adversarial instructions drive market insularity rather than exploitation, while collaborative priors lower execution quality.
Increased evolutionary pressure (tripling the elimination rate) worsens every major metric: wealth, quality, dispute rate, and cross-family connectivity.
Figure 4: Ablation effect sizes show transparency and fierce selection severely degrade cross-family trade and overall market performance.
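The cross-family trade figure cited for the transparency ablation can be read as the fraction of contracts whose poster and contractor come from different model families. The sketch below computes that fraction under this assumed definition; the agent identifiers, the family mapping, and the contract list are hypothetical.

```python
def cross_family_share(contracts, family_of):
    """Fraction of contracts whose poster and contractor belong to different families.

    contracts: iterable of (poster_id, contractor_id) pairs.
    family_of: mapping from agent id to model-family label.
    """
    contracts = list(contracts)
    if not contracts:
        return 0.0
    cross = sum(
        1
        for poster, contractor in contracts
        if family_of[poster] != family_of[contractor]
    )
    return cross / len(contracts)


# Hypothetical agents and contracts for illustration only.
family_of = {"a1": "GPT", "a2": "GPT", "b1": "GLM"}
contracts = [("a1", "b1"), ("a1", "a2"), ("b1", "a2"), ("a2", "b1")]
share = cross_family_share(contracts, family_of)
```

Comparing this share between a baseline run and an ablated run gives the kind of before-and-after contrast the ablation reports.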

Market surplus is not intrinsic to agent capability, but arises from alignment between agent heterogeneity and institutional settings that permit specialization and cooperation.

Agent Belief, Reasoning, and Linguistic Signatures

Every strategic decision in Diagon is accompanied by natural-language reasoning, providing a high-fidelity window into agent cognition and behavioral priors:

Poster reasoning text predicts payment decisions, demonstrating that language encodes strategic outcomes.
Model families develop distinct evaluation personalities—GLM posters show elevated punishment-related reasoning and false dispute rates; GPT posters are consistently more generous.
Agent beliefs update substantially each round, drifting toward more self-interested framing as evolutionary selection proceeds.
Figure 5: Theme fingerprint and belief polarity show model-family/personality distinctions and skill-cluster sentiment variation.

Practical and Theoretical Implications

The findings articulate several actionable principles for agent-market design:

Specialization is critical; as foundation models commoditize, only tasks requiring genuine expertise generate sustainable surplus.
Evaluation infrastructure, not agent capability, is the primary bottleneck; advances in verification and quality assessment will be more impactful than further model improvements.
Diversity must be actively maintained; evolutionary and competitive pressures naturally erode model and skill heterogeneity, limiting system resilience.
Direct transfers from human market institutions are unreliable; transparency, honesty norms, and strong incentives (effective among humans because of social context and repeated interaction) lack the intended effect and sometimes backfire in agent economies.

These results underscore the necessity for empirical, mechanistic experimentation to guide institutional choices, rather than reliance on theory or direct translation from human-centric market dynamics.

Conclusion

Diagon shows that agent-driven cognitive-labour markets generate significant gains in wealth and task quality, contingent upon carefully designed institutional governance. The productivity, equality, and coordination advantages of market exchange are fragile: they depend on the rule structure and can disappear when it changes. Standard fixes from human markets (transparency, honesty, and aggressive selection) often undermine these gains. Empirical experimentation via platforms such as Diagon is thus essential for discovering and validating robust agent economic infrastructures. The implications extend to practical deployment of agent marketplaces, evaluation system design, and meta-theoretical considerations in multi-agent system alignment and stability.
