
Semantic Non-Fungibility and Violations of the Law of One Price in Prediction Markets

Published 5 Jan 2026 in cs.CE | (2601.01706v1)

Abstract: Prediction markets are designed to aggregate dispersed information about future events, yet today's ecosystem is fragmented across heterogeneous operator-run platforms and blockchain-based protocols that independently list economically identical events. In the absence of a shared notion of event identity, liquidity fails to pool across venues, arbitrage becomes capital-intensive or unenforceable, and prices systematically violate the Law of One Price. As a result, market prices reflect platform-local beliefs rather than a single, globally aggregated probability, undermining the core information-aggregation function of prediction markets. We address this gap by introducing a semantic alignment framework that makes cross-platform event identity explicit through joint analysis of natural-language descriptions, resolution semantics, and temporal scope. Applying this framework, we construct the first human-validated, cross-platform dataset of aligned prediction markets, covering over 100 000 events across ten major venues from 2018 to 2025. Using this dataset, we show that roughly 6% of all events are concurrently listed across platforms and that semantically equivalent markets exhibit persistent execution-aware price deviations of 2-4% on average, even in highly liquid and information-rich settings. These mispricings give rise to persistent cross-platform arbitrage opportunities driven by structural frictions rather than informational disagreement. Overall, our results demonstrate that semantic non-fungibility is a fundamental barrier to price convergence, and that resolving event identity is a prerequisite for prediction markets to aggregate information at a global scale.

Summary

  • The paper introduces semantic non-fungibility as a structural barrier that disrupts the law of one price in prediction markets.
  • The methodology combines embedding-based retrieval with LLM-based logical verification; 99.9% of true matches appear among the top-20 retrieved candidates in a dataset of more than 100,000 events.
  • The empirical results show persistent 2–4% price deviations and nominally risk-free arbitrage opportunities, challenging the view that prediction markets aggregate information into a single global price.

Semantic Non-Fungibility and Price Divergence in Prediction Markets

Introduction

"Semantic Non-Fungibility and Violations of the Law of One Price in Prediction Markets" (2601.01706) offers a rigorous empirical and theoretical account of liquidity fragmentation in contemporary prediction markets. The central thesis is that the absence of a formalized, interoperable event identity precludes price alignment for economically identical contingent claims across platforms. By operationalizing the concept of semantic non-fungibility, the work decomposes market inefficiency into structural barriers rooted in event specification, oracle procedures, and platform design, rather than classical drivers such as informational asymmetries or low liquidity.

Fragmentation Drivers and Market Structure

Prediction markets traditionally function as mechanisms for aggregating dispersed information, with the market price of a binary claim interpreted as a consensus probability. The classical results, parity of YES and NO positions ($p_Y + p_N = 1$) and the cross-platform Law of One Price, assume a fungible, atomically transferable asset. The empirical reality diverges sharply: platforms independently define events using natural language, proprietary APIs, or immutable smart contract metadata, each with distinct oracle, cutoff, and exception interpretations. As a result, economically identical claims are rendered semantically non-fungible.
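The parity and one-price conditions can be written out explicitly. The venue superscripts $A$, $B$ and the ask-price symbols $a_Y$, $a_N$ below are notation assumed here for illustration, not taken from the paper, and fees are ignored.

```latex
\begin{align*}
  p_Y + p_N &= 1
    && \text{(single-venue parity)} \\
  p_Y^{A} &= p_Y^{B}
    && \text{(Law of One Price for equivalent events on venues $A$ and $B$)} \\
  a_Y^{A} + a_N^{B} &< 1
    && \text{(cross-venue violation: YES on $A$ plus NO on $B$ costs less than the payout)}
\end{align*}
```

If the two markets are semantically equivalent, exactly one of the two legs pays 1 at resolution, so a combined cost below 1 implies a gross profit of $1 - (a_Y^{A} + a_N^{B})$ per pair of contracts.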

This fragmentation is quantifiable and persistent. The authors introduce a precise taxonomy of event relations, distinguishing between semantic equivalence (identical YES-regions in the atomic outcome space), subset relations (logical implication), and semantic independence. Their cross-platform framework makes these concepts operational via LLM-based event embedding, structural filtering, and logical verification over all major operator-run and decentralized prediction market venues.
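The taxonomy can be illustrated with plain set operations on YES-regions. The following is a minimal sketch on a toy outcome space; the names `Relation` and `classify` and the example events are hypothetical, and `UNRELATED` is a catch-all used here for any pair that is neither equivalent nor nested, which may be coarser than the paper's notion of semantic independence.

```python
from enum import Enum
from itertools import product


class Relation(Enum):
    EQUIVALENT = "equivalent"  # identical YES-regions
    SUBSET = "subset"          # first event implies the second
    SUPERSET = "superset"      # second event implies the first
    UNRELATED = "unrelated"    # catch-all: neither equal nor nested


def classify(yes_a: frozenset, yes_b: frozenset) -> Relation:
    """Compare two events by their YES-regions in a shared outcome space."""
    if yes_a == yes_b:
        return Relation.EQUIVALENT
    if yes_a < yes_b:   # strict subset
        return Relation.SUBSET
    if yes_a > yes_b:   # strict superset
        return Relation.SUPERSET
    return Relation.UNRELATED


# Toy atomic outcomes: (winner, winner_is_inaugurated)
OUTCOMES = frozenset(product(["X", "Y"], [True, False]))

x_wins = frozenset(o for o in OUTCOMES if o[0] == "X")
x_wins_and_inaugurated = frozenset(o for o in OUTCOMES if o[0] == "X" and o[1])
y_wins = frozenset(o for o in OUTCOMES if o[0] == "Y")

print(classify(x_wins, x_wins))                  # Relation.EQUIVALENT
print(classify(x_wins_and_inaugurated, x_wins))  # Relation.SUBSET
print(classify(x_wins, y_wins))                  # Relation.UNRELATED
```

In practice the YES-regions are not enumerable like this; the sketch only shows what the relations mean once two event descriptions have been mapped to a common outcome space.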

Cross-Platform Semantic Alignment: Methodology and Data

A new pipeline is introduced for aligning events across platforms, combining category filtering, high-recall semantic embedding retrieval (OpenAI text-embedding-3-large), and multi-pass LLM-based logical verification to confirm equivalence or subset relationships. The pipeline achieves high recall (99.9% of matches are found among the top-20 embedding neighbors), a false-positive rate below 2%, and high human-annotator agreement ($\kappa = 0.94$), attesting to its robustness for large-scale ecosystem mapping.
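The retrieval step and its recall metric can be sketched as follows. This is an illustration only: the hand-made three-dimensional vectors stand in for real embeddings, no embedding API is called, and the function names `cosine`, `top_k`, and `recall_at_k` are hypothetical rather than the authors' code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k(query, candidates, k=20):
    """Return the ids of the k candidates most similar to the query vector."""
    ranked = sorted(candidates, key=lambda cid: cosine(query, candidates[cid]),
                    reverse=True)
    return ranked[:k]


def recall_at_k(queries, candidates, gold, k=20):
    """Fraction of queries whose known match appears among the top-k candidates."""
    hits = sum(1 for qid, vec in queries.items()
               if gold[qid] in top_k(vec, candidates, k))
    return hits / len(queries)


# Toy stand-ins for event embeddings on two venues.
candidates = {"k1": [1.0, 0.0, 0.0], "k2": [0.0, 1.0, 0.0], "k3": [0.0, 0.0, 1.0]}
queries = {"p1": [0.9, 0.1, 0.0], "p2": [0.1, 0.2, 0.9], "p3": [0.6, 0.7, 0.0]}
gold = {"p1": "k1", "p2": "k3", "p3": "k1"}  # human-labelled true matches

print(recall_at_k(queries, candidates, gold, k=1))
print(recall_at_k(queries, candidates, gold, k=2))
```

Retrieval with a generous k only narrows the candidate set; in the pipeline described above, a separate verification pass decides whether each retrieved pair is actually equivalent or subset-related.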

The resulting dataset, exceeding 100,000 events and drawn from a diverse set of platforms (e.g., Kalshi, Polymarket, Omen, Myriad, PredictIt), represents the most comprehensive cross-platform prediction market alignment to date. This data underpins the empirical measurement of semantic overlap, fragmentation, and price divergence.

Figure 1: Longitudinal expansion of independent prediction-market events and the relative market share across platforms, highlighting major regulatory interventions.

Empirical Results: Semantic Fragmentation and Arbitrage

Analysis reveals that approximately 6% of all listed events are semantically equivalent or subset-related across at least two platforms, a proportion that increases significantly for high-visibility, long-lived events. These events span thousands of equivalence classes and subset chains, forming a structured and directionally asymmetric web of relations, especially between major platforms such as Kalshi and Polymarket.

Figure 3: Chord diagram illustrating cross-platform equivalence relations, with arc size proportional to relations per platform and ribbon color encoding original listing direction.

The fragmentation is concentrated temporally and categorically: cross-platform overlap was negligible prior to 2022, but has sharply increased since, with the 2024 U.S. presidential election as a pivotal inflection point. Political events constitute the majority, but there is increasing duplication in finance, crypto, and other high-salience domains.

Figure 5: Temporal evolution of semantically matched market pairs, categorized by event type and platform.

Execution-Aware Price Divergence

Contrary to parity expectations, execution-aware price deviations for semantically equivalent or subset-related markets are systematic and persistent. Even after adjusting for platform-specific spreads, fees, slippage, and tick size, the deviation for high-liquidity pairs averages 2–4% throughout event lifetimes. Maximum deviations often exceed 7%, and nominally risk-free arbitrage opportunities with annualized yields of several hundred percent persist for hours or days, particularly in the largest markets.

Figure 2: Price deviation from theoretical equilibrium versus effective cross-platform liquidity, highlighting arbitrage persistence even at high liquidity.
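One plausible way to make "execution-aware" concrete is to price the cheapest YES-plus-NO pair across two venues at executable ask prices, after fees, and measure how far it falls below the payout of 1. The sketch below assumes a simple proportional fee on the purchase price and ignores slippage and tick size; the `Quote` type, the fee model, and the numbers are illustrative assumptions, not the paper's definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    ask_yes: float   # best executable price to buy YES
    ask_no: float    # best executable price to buy NO
    fee_rate: float  # proportional fee on purchase price (assumed fee model)


def leg_cost(price: float, fee_rate: float) -> float:
    """All-in cost of buying one contract at the given ask."""
    return price * (1.0 + fee_rate)


def execution_aware_deviation(a: Quote, b: Quote) -> float:
    """Shortfall of the cheapest cross-venue YES+NO pair below the payout of 1.

    Returns 0.0 when neither direction is cheaper than 1 after fees.
    """
    cost_ab = leg_cost(a.ask_yes, a.fee_rate) + leg_cost(b.ask_no, b.fee_rate)
    cost_ba = leg_cost(b.ask_yes, b.fee_rate) + leg_cost(a.ask_no, a.fee_rate)
    return max(0.0, 1.0 - min(cost_ab, cost_ba))


venue_a = Quote(ask_yes=0.60, ask_no=0.42, fee_rate=0.00)
venue_b = Quote(ask_yes=0.66, ask_no=0.36, fee_rate=0.01)

# YES on A (0.60) plus NO on B (0.36 * 1.01) costs 0.9636, leaving about 0.0364.
print(round(execution_aware_deviation(venue_a, venue_b), 4))
```

Using ask prices rather than mid prices means that ordinary bid-ask spreads within a venue do not register as deviations; only a gap that survives execution costs does.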

These deviations cannot be explained by informational or liquidity deficits. Instead, they are rooted in structural barriers: capital and enforceability constraints prevent the atomic offset of positions across platforms. Arbitrageurs must commit capital over protracted, uncertain resolutions since no platform supports enforceable netting, and there is no atomic swap for event-contingent positions (contrasting with fungible tokens where DeFi infrastructure enables rapid cross-chain price equalization).

Figure 4: Relationship between liquidity and (left) worst-case annualized arbitrage yield and (right) temporal persistence of execution-aware arbitrage opportunities.
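The link between capital lockup and annualized yield can be sketched with simple, non-compounded annualization. The function names and the candidate settlement horizons below are assumptions for illustration; the paper's exact yield definition may differ.

```python
def annualized_yield(cost: float, payout: float, days_locked: float) -> float:
    """Simple (non-compounded) annualized return on capital locked until resolution."""
    if cost <= 0 or days_locked <= 0:
        raise ValueError("cost and days_locked must be positive")
    return (payout - cost) / cost * 365.0 / days_locked


def worst_case_yield(cost: float, payout: float, horizons_days) -> float:
    """Yield under the longest candidate settlement horizon (most capital lockup)."""
    return annualized_yield(cost, payout, max(horizons_days))


cost = 0.9636  # hypothetical all-in cost of a two-leg position paying 1.0
for days in (3, 75):
    print(days, round(annualized_yield(cost, 1.0, days), 3))

print(round(worst_case_yield(cost, 1.0, [3, 75]), 3))
```

The same price gap produces a very different annualized figure depending on how long the position must be held, which is why uncertainty about the settlement date matters as much as the size of the deviation.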

Case Study: 2024 U.S. Presidential Election

The 2024 U.S. election, the deepest and most liquid contemporary prediction market, exemplifies these dynamics. Prices for Trump-YES claims diverged by up to 7 cents between Polymarket and Kalshi for extended intervals, even though both markets referenced the same real-world event and drew on essentially the same public information. The differences were driven by platform-specific resolution semantics (election-night call vs. inauguration) and jurisdictional segmentation, not by belief heterogeneity or a lack of liquidity.

Figure 6: Execution-adjusted Trump-YES price divergence across Kalshi and Polymarket, contrasted with realized arbitrage yields under alternative settlement horizons.

A two-leg arbitrage, buying YES on Polymarket (the superset market) and NO on Kalshi (the subset market), locks in a profit regardless of outcome whenever the combined cost is below the guaranteed payout, but only at the cost of significant capital lockup and legal risk. Regulatory segmentation further impedes efficient price discovery, but is not the principal source of divergence.
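The payoff of a YES-on-superset, NO-on-subset position can be checked by enumerating the three logically possible outcomes. The prices below are hypothetical, fees and lockup costs are ignored, and the function name `two_leg_profits` is illustrative.

```python
def two_leg_profits(price_yes_superset: float, price_no_subset: float) -> dict:
    """Profit per contract pair in each logically possible outcome.

    The subset event implies the superset event, so the case
    "subset YES, superset NO" cannot occur and is omitted.
    """
    cost = price_yes_superset + price_no_subset
    # (superset resolves YES, subset resolves YES)
    scenarios = {
        "both_yes": (True, True),
        "superset_yes_only": (True, False),
        "both_no": (False, False),
    }
    profits = {}
    for name, (superset_yes, subset_yes) in scenarios.items():
        payout = (1.0 if superset_yes else 0.0) + (0.0 if subset_yes else 1.0)
        profits[name] = payout - cost
    return profits


profits = two_leg_profits(price_yes_superset=0.58, price_no_subset=0.37)
for outcome, profit in profits.items():
    print(f"{outcome}: {profit:+.2f}")
```

At least one leg pays in every outcome, so the profit is never below 1 minus the combined cost; when the superset resolves YES and the subset resolves NO, both legs pay.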

Implications and Theoretical Consequences

The findings directly challenge the assumption that prediction markets are universal information aggregators. Without enforceable, machine-verifiable event identity, local liquidity pools and price signals dominate. Arbitrage is fundamentally limited by semantic non-fungibility: while fungible assets (e.g., ERC-20 tokens) exhibit rapid price convergence via DeFi and cross-chain protocols, prediction market claims lack any such infrastructure due to individualized event specification, oracle divergence, and absence of atomic cross-platform execution.

The failure of the Law of One Price is thus structural, not incidental. No global "market probability" exists for major events; the best that can be claimed is platform-specific, locally consistent pricing. This has both practical (disincentivizing large-scale capital allocation and risk management) and theoretical (delegitimizing the notion of ecosystem-level forecast aggregation) ramifications.

Prospects for Semantic Interoperability and Future Research

The analysis demonstrates the necessity of a shared canonical event specification layer (akin to ISINs for securities or ERC-20 contract addresses for tokens) for meaningful cross-platform information integration and arbitrage enforcement. Potential directions include:

  • Standardized Event Ontologies: Formal schemas for event specification, enabling atomic cross-platform resolution mapping.
  • Verifiable Event Oracles: Shared outcome sources and semantics, strengthening logical equivalence across market implementations.
  • Atomic Arbitrage Protocols: Cryptoeconomic mechanisms for position swapping and netting contingent on cross-venue event resolution, possibly leveraging advances in cross-chain atomicity (Öz et al., 28 Jan 2025).

Until such foundations are established, cross-platform information aggregation in prediction markets will remain fragmented and arbitrage-constrained.
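As a rough illustration of what a canonical event specification could look like, the sketch below derives a deterministic identifier from a small structured record. The schema, field names, and hashing scheme are hypothetical and are not proposed in the paper; the identifier only matches for specifications that are textually identical after normalization, so it presupposes that venues already agree on a common schema and wording.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EventSpec:
    """Hypothetical minimal event record; all fields are illustrative."""
    proposition: str        # what must be true for YES
    resolution_source: str  # who or what decides the outcome
    resolution_rule: str    # how the source's output maps to YES or NO
    cutoff_utc: str         # ISO 8601 timestamp after which the event resolves


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so cosmetic differences do not matter."""
    return " ".join(text.lower().split())


def canonical_id(spec: EventSpec) -> str:
    """SHA-256 hex digest of the normalized, key-sorted JSON encoding."""
    payload = {key: _normalize(value) for key, value in asdict(spec).items()}
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


spec_a = EventSpec("Candidate X wins the election", "Official certification",
                   "YES if certified winner is X", "2025-01-20T17:00:00Z")
spec_b = EventSpec("candidate x  wins the election", "official certification",
                   "yes if certified winner is X", "2025-01-20T17:00:00Z")
spec_c = EventSpec("Candidate X wins the election", "Major media call",
                   "YES if media call names X", "2025-01-20T17:00:00Z")

print(canonical_id(spec_a) == canonical_id(spec_b))  # True
print(canonical_id(spec_a) == canonical_id(spec_c))  # False
```

Markets that differ only in resolution source or rule receive different identifiers, which is the distinction that natural-language titles alone tend to hide.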

Conclusion

The work provides a comprehensive account of how semantic non-fungibility generates persistent, economically significant cross-platform price divergences in prediction markets, even in highly liquid and efficient settings. By constructing a cross-venue semantic aligner and large-scale empirical dataset, the authors map the structure and magnitude of fragmentation, quantify the limits of arbitrage, and demonstrate that the absence of enforceable event identity is the principal barrier to price convergence. The results indicate that local, not global, information aggregation prevails, and that addressing semantic non-fungibility is a prerequisite for realizing the canonical vision of universally efficient prediction markets.
