- The paper introduces semantic non-fungibility as a structural barrier that disrupts the law of one price in prediction markets.
- The methodology employs a robust LLM-based pipeline for event embedding and logical verification, achieving 99.9% recall over 100,000+ events.
- The empirical results show persistent 2–4% price deviations and risk-free arbitrage opportunities, challenging conventional market aggregation theories.
Semantic Non-Fungibility and Price Divergence in Prediction Markets
Introduction
"Semantic Non-Fungibility and Violations of the Law of One Price in Prediction Markets" (2601.01706) offers a rigorous empirical and theoretical account of liquidity fragmentation in contemporary prediction markets. The central thesis is that the absence of a formalized, interoperable event identity precludes price alignment for economically identical contingent claims across platforms. By operationalizing the concept of semantic non-fungibility, the work decomposes market inefficiency into structural barriers rooted in event specification, oracle procedures, and platform design, rather than classical drivers such as informational asymmetries or low liquidity.
Fragmentation Drivers and Market Structure
Prediction markets traditionally function as mechanisms for aggregating dispersed information, with the market price of a binary claim interpreted as a consensus probability. The classical result, parity of YES and NO positions ($p_Y + p_N = 1$) together with a cross-platform Law of One Price, assumes a fungible, atomically transferable asset. The empirical reality diverges sharply: platforms independently define events using natural language, proprietary APIs, or immutable smart contract metadata, each with distinct oracle, cutoff, and exception interpretations. As a result, economically identical claims are rendered semantically non-fungible.
This fragmentation is quantifiable and persistent. The authors introduce a precise taxonomy of event relations, distinguishing between semantic equivalence (identical YES-regions in the atomic outcome space), subset relations (logical implication), and semantic independence. Their cross-platform framework makes these concepts operational via LLM-based event embedding, structural filtering, and logical verification over all major operator-run and decentralized prediction market venues.
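The taxonomy above can be illustrated with plain set operations. The following is a minimal sketch, not the authors' implementation: it assumes each market's YES-region is given as a finite set of atomic outcomes, and the names `Relation` and `classify` are hypothetical. Any pair that is neither equivalent nor nested is lumped under `INDEPENDENT`, which is a simplification.

```python
from enum import Enum
from typing import FrozenSet


class Relation(Enum):
    EQUIVALENT = "equivalent"    # identical YES-regions
    SUBSET = "subset"            # YES on a implies YES on b
    SUPERSET = "superset"        # YES on b implies YES on a
    INDEPENDENT = "independent"  # simplification: everything else


def classify(yes_a: FrozenSet[str], yes_b: FrozenSet[str]) -> Relation:
    """Compare two YES-regions over a shared atomic outcome space."""
    if yes_a == yes_b:
        return Relation.EQUIVALENT
    if yes_a < yes_b:  # proper subset
        return Relation.SUBSET
    if yes_a > yes_b:  # proper superset
        return Relation.SUPERSET
    return Relation.INDEPENDENT


# Toy outcome space (hypothetical atoms): one market resolves on the
# election call, the other only on inauguration.
called = frozenset({"called_and_inaugurated", "called_not_inaugurated"})
inaugurated = frozenset({"called_and_inaugurated"})

print(classify(inaugurated, called))  # Relation.SUBSET
```

In practice the atomic outcome space is not enumerated explicitly, which is why the verification step relies on an LLM rather than on set comparison.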
A new pipeline is introduced for aligning events across platforms, combining category filtering, high-recall semantic embedding retrieval (OpenAI text-embedding-3-large), and multi-pass LLM-based logical verification to confirm equivalence or subset relationships. The pipeline achieves high recall (99.9% of matches are found within the top-20 retrieved candidates), a low false-positive rate (<2%), and high human-annotator agreement (κ=0.94), supporting its use for large-scale ecosystem mapping.
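The three stages can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the paper's code: embeddings are taken as precomputed vectors, the `verify` callable is a hypothetical stand-in for the LLM verification step, and requiring unanimous verdicts across passes is an assumption about how multiple passes might be combined.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Event:
    event_id: str
    platform: str
    category: str
    embedding: Tuple[float, ...]  # assumed precomputed elsewhere


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k_candidates(query: Event, pool: List[Event], k: int = 20) -> List[Event]:
    """Stages 1 and 2: category filter, then embedding retrieval."""
    candidates = [
        e for e in pool
        if e.platform != query.platform and e.category == query.category
    ]
    candidates.sort(key=lambda e: cosine(query.embedding, e.embedding), reverse=True)
    return candidates[:k]


def align(
    query: Event,
    pool: List[Event],
    verify: Callable[[Event, Event], str],  # hypothetical LLM wrapper
    k: int = 20,
    passes: int = 3,
) -> List[Tuple[Event, str]]:
    """Stage 3: keep candidates whose verdict is stable across all passes."""
    matches = []
    for cand in top_k_candidates(query, pool, k):
        verdicts = {verify(query, cand) for _ in range(passes)}
        if len(verdicts) == 1:
            verdict = verdicts.pop()
            if verdict in ("equivalent", "subset", "superset"):
                matches.append((cand, verdict))
    return matches
```

Retrieval is tuned for recall and verification for precision, so the top-k cutoff only needs to be wide enough that true matches are rarely dropped before the logical check.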
The resulting dataset, exceeding 100,000 events and drawn from a diverse set of platforms (e.g., Kalshi, Polymarket, Omen, Myriad, PredictIt), represents the most comprehensive cross-platform prediction market alignment to date. This data underpins the empirical measurement of semantic overlap, fragmentation, and price divergence.

Figure 1: Longitudinal expansion of independent prediction-market events and the relative market share across platforms, highlighting major regulatory interventions.
Empirical Results: Semantic Fragmentation and Arbitrage
Analysis reveals that approximately 6% of all listed events are semantically equivalent or subset-related across at least two platforms, a proportion that increases significantly for high-visibility, long-lived events. These events span thousands of equivalence classes and subset chains, forming a structured and directionally asymmetric web of relations—especially between major platforms such as Kalshi and Polymarket.
Figure 3: Chord diagram illustrating cross-platform equivalence relations, with arc size proportional to relations per platform and ribbon color encoding original listing direction.
The fragmentation is concentrated temporally and categorically: cross-platform overlap was negligible prior to 2022, but has sharply increased since, with the 2024 U.S. presidential election as a pivotal inflection point. Political events constitute the majority, but there is increasing duplication in finance, crypto, and other high-salience domains.
Figure 5: Temporal evolution of semantically matched market pairs, categorized by event type and platform.
Execution-Aware Price Divergence
Contrary to parity expectations, execution-aware price deviations for semantically equivalent or subset-related markets are systematic and persistent. Even after adjusting for platform-specific spreads, fees, slippage, and tick size, the median deviation for high-liquidity pairs remains 2–4% throughout event lifetimes. Maximum deviations often exceed 7%, and risk-free arbitrage opportunities with annualized yields of several hundred percent persist for hours or days, particularly in the largest markets.
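To make the arithmetic behind these yields concrete, here is a small worked sketch for a semantically equivalent pair. It assumes a simple definition (buy YES on one venue and NO on the other at their ask prices, pay a combined per-unit fee, receive 1 at resolution) and simple, non-compounded annualization; the paper's exact execution-aware definition may differ, and the function names are hypothetical.

```python
def arbitrage_edge(yes_ask: float, no_ask: float, fees: float = 0.0) -> float:
    """Profit per unit payout from buying YES on one venue and NO on another."""
    return 1.0 - (yes_ask + no_ask + fees)


def annualized_yield(
    yes_ask: float, no_ask: float, days_to_resolution: float, fees: float = 0.0
) -> float:
    """Simple (non-compounded) annualized return on the capital locked up."""
    cost = yes_ask + no_ask + fees
    if cost <= 0 or days_to_resolution <= 0:
        raise ValueError("cost and days_to_resolution must be positive")
    return (arbitrage_edge(yes_ask, no_ask, fees) / cost) * (365.0 / days_to_resolution)


# Illustrative numbers only: a 2-cent edge after fees, resolved in 3 days.
edge = arbitrage_edge(0.55, 0.42, fees=0.01)
rate = annualized_yield(0.55, 0.42, days_to_resolution=3, fees=0.01)
print(f"edge={edge:.3f}, annualized={rate:.0%}")
```

With these made-up inputs a 2-cent edge held for three days annualizes to roughly 250%, which shows how a small deviation can translate into yields of several hundred percent when resolution is near.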
Figure 2: Price deviation from theoretical equilibrium versus effective cross-platform liquidity, highlighting arbitrage persistence even at high liquidity.
These deviations are not explainable via informational or liquidity deficits. Instead, they are rooted in structural barriers: capital and enforceability constraints prevent the atomic offset of positions across platforms. Arbitrageurs must commit capital over protracted, uncertain resolutions since no platform supports enforceable netting, and there is no atomic swap for event-contingent positions (contrasting with fungible tokens where DeFi infrastructure enables rapid cross-chain price equalization).
Figure 4: Relationship between liquidity and (left) worst-case annualized arbitrage yield and (right) temporal persistence of execution-aware arbitrage opportunities.
Case Study: 2024 U.S. Presidential Election
The 2024 U.S. election, the deepest and most liquid contemporary prediction market, exemplifies these dynamics. Prices for Trump-YES claims diverged by up to 7 cents between Polymarket and Kalshi for extended intervals, even though both markets referenced the same real-world event and traders had access to nearly perfect common-knowledge information. The differences were driven by platform-specific resolution semantics (election-night call vs. inauguration) and jurisdictional segmentation, not by belief heterogeneity or liquidity.
Figure 6: Execution-adjusted Trump-YES price divergence across Kalshi and Polymarket, contrasted with realized arbitrage yields under alternative settlement horizons.
A two-leg arbitrage, YES on Polymarket (superset) and NO on Kalshi (subset), yields a deterministic profit regardless of outcome, but only at the cost of significant capital lockup and legal risk. Regulatory segmentation further impedes efficient price discovery, but is not the principal source of divergence.
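The payoff logic of this superset/subset trade can be checked by enumerating the outcome states that are logically possible when one YES-region contains the other. The sketch below uses made-up prices and a hypothetical function name, and it ignores fees, capital lockup, and settlement risk.

```python
from typing import Dict


def subset_arbitrage_profits(superset_yes_price: float, subset_no_price: float) -> Dict[str, float]:
    """Profit per state from buying YES on the superset and NO on the subset.

    The state (superset NO, subset YES) is excluded because subset YES
    implies superset YES.
    """
    cost = superset_yes_price + subset_no_price
    states = {
        "both YES": (True, True),
        "superset YES only": (True, False),
        "both NO": (False, False),
    }
    profits = {}
    for name, (superset_yes, subset_yes) in states.items():
        payout = (1.0 if superset_yes else 0.0) + (0.0 if subset_yes else 1.0)
        profits[name] = payout - cost
    return profits


# Illustrative prices only: total cost 0.96 for a payout of at least 1.
for state, profit in subset_arbitrage_profits(0.58, 0.38).items():
    print(f"{state}: {profit:+.2f}")
```

Every feasible state pays at least 1, so the trade is profitable whenever the two prices sum to less than 1; the middle state, where only the superset resolves YES, pays out on both legs.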
Implications and Theoretical Consequences
The findings directly challenge the assumption that prediction markets are universal information aggregators. Without enforceable, machine-verifiable event identity, local liquidity pools and price signals dominate. Arbitrage is fundamentally limited by semantic non-fungibility: while fungible assets (e.g., ERC-20 tokens) exhibit rapid price convergence via DeFi and cross-chain protocols, prediction market claims lack any such infrastructure due to individualized event specification, oracle divergence, and absence of atomic cross-platform execution.
The failure of law-of-one-price is thus structural, not incidental. No global "market probability" exists for major events—the best that can be claimed is platform-specific, locally consistent pricing. This has both practical (disincentivizing large-scale capital allocation and risk management) and theoretical (delegitimizing the notion of ecosystem-level forecast aggregation) ramifications.
Prospects for Semantic Interoperability and Future Research
The analysis demonstrates the necessity of a shared canonical event specification layer—akin to ISINs for equities or ERC-20 addresses for tokens—for meaningful cross-platform information integration and arbitrage enforcement. Potential directions include:
- Standardized Event Ontologies: Formal schemas for event specification, enabling atomic cross-platform resolution mapping.
- Verifiable Event Oracles: Shared outcome sources and semantics, strengthening logical equivalence across market implementations.
- Atomic Arbitrage Protocols: Cryptoeconomic mechanisms for position swapping and netting contingent on cross-venue event resolution, possibly leveraging advances in cross-chain atomicity (Öz et al., 28 Jan 2025).
Until such foundations are established, cross-platform information aggregation in prediction markets will remain fragmented and arbitrage-constrained.
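As a rough illustration of what a shared event specification layer might look like, the sketch below derives a deterministic identifier from a structured event description. The schema (`EventSpec` and its fields) and the hashing scheme are hypothetical, chosen only to show the idea; the source does not propose a concrete format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class EventSpec:
    """Hypothetical canonical event description shared across venues."""
    proposition: str             # normalized statement of the YES condition
    resolution_source: str       # oracle or authority that decides the outcome
    cutoff_utc: str              # ISO 8601 timestamp of the resolution cutoff
    exceptions: Tuple[str, ...] = ()  # explicit edge-case rules

    def canonical_id(self) -> str:
        # Sorted keys and fixed separators make the serialization deterministic.
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Placeholder values for illustration only.
spec_a = EventSpec("candidate X is inaugurated", "official record", "2025-01-20T17:00:00Z")
spec_b = EventSpec("candidate X is inaugurated", "official record", "2025-01-20T17:00:00Z")
spec_c = EventSpec("candidate X is inaugurated", "official record", "2025-01-21T17:00:00Z")

print(spec_a.canonical_id() == spec_b.canonical_id())  # True
print(spec_a.canonical_id() == spec_c.canonical_id())  # False
```

Two venues that list claims with the same identifier would be asserting identical resolution semantics, while any difference in oracle, cutoff, or exception handling yields a different identifier.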
Conclusion
The work provides a comprehensive account of how semantic non-fungibility generates persistent, economically significant cross-platform price divergences in prediction markets, even in highly liquid and efficient settings. By constructing a cross-venue semantic aligner and large-scale empirical dataset, the authors map the structure and magnitude of fragmentation, quantify the limits of arbitrage, and demonstrate that the absence of enforceable event identity is the principal barrier to price convergence. The results indicate that local, not global, information aggregation prevails, and that addressing semantic non-fungibility is a prerequisite for realizing the canonical vision of universally efficient prediction markets.