Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets (2508.03474v1)

Published 5 Aug 2025 in cs.CR and q-fin.TR

Abstract: Polymarket is a prediction market platform where users can speculate on future events by trading shares tied to specific outcomes, known as conditions. Each market is associated with a set of one or more such conditions. To ensure proper market resolution, the condition set must be exhaustive -- collectively accounting for all possible outcomes -- and mutually exclusive -- only one condition may resolve as true. Thus, the collective prices of all related outcomes should be \$1, representing a combined probability of 1 of any outcome. Despite this design, Polymarket exhibits cases where dependent assets are mispriced, allowing for purchasing (or selling) a certain outcome for less than (or more than) \$1, guaranteeing profit. This phenomenon, known as arbitrage, could enable sophisticated participants to exploit such inconsistencies. In this paper, we conduct an empirical arbitrage analysis on Polymarket data to answer three key questions: (Q1) What conditions give rise to arbitrage (Q2) Does arbitrage actually occur on Polymarket and (Q3) Has anyone exploited these opportunities. A major challenge in analyzing arbitrage between related markets lies in the scalability of comparisons across a large number of markets and conditions, with a naive analysis requiring $O(2^{n+m})$ comparisons. To overcome this, we employ a heuristic-driven reduction strategy based on timeliness, topical similarity, and combinatorial relationships, further validated by expert input. Our study reveals two distinct forms of arbitrage on Polymarket: Market Rebalancing Arbitrage, which occurs within a single market or condition, and Combinatorial Arbitrage, which spans across multiple markets. We use on-chain historical order book data to analyze when these types of arbitrage opportunities have existed, and when they have been executed by users. We find a realized estimate of 40 million USD of profit extracted.

Summary

The paper develops a formal taxonomy of arbitrage, distinguishing market rebalancing and combinatorial strategies to reveal critical market inefficiencies.
It employs LLMs to detect logical dependencies between market conditions, validating the approach using extensive on-chain data over a year.
The analysis quantifies arbitrage profits exceeding $39M, highlighting the concentration of gains among a few high-frequency actors and persistent market inefficiency.

Arbitrage in Decentralized Prediction Markets: An Empirical Analysis of Polymarket

Introduction

This paper presents a comprehensive empirical study of arbitrage in decentralized prediction markets, focusing on Polymarket, a leading platform built on the Polygon blockchain. The authors develop a formal taxonomy of arbitrage opportunities, introduce a scalable methodology for detecting market dependencies using LLMs, and quantify both the existence and exploitation of arbitrage across a full year of Polymarket data. The analysis reveals significant inefficiencies, the prevalence of both intra- and inter-market arbitrage, and the emergence of sophisticated arbitrageur strategies, with realized profits exceeding $39 million.

Prediction Market Structure and Arbitrage Taxonomy

Polymarket operates via a hybrid-decentralized central limit order book (CLOB), where each market is defined by one or more binary conditions (e.g., "Will candidate X win?"). The condition set of a market is designed to be exhaustive and mutually exclusive, so the "YES" token prices across its conditions should sum to $1. The authors formalize two primary arbitrage types:

Market Rebalancing Arbitrage: Occurs within a single market when the sum of "YES" prices deviates from 1, enabling risk-free profit by taking long or short positions across all outcomes.
Combinatorial Arbitrage: Emerges between dependent markets when logical relationships between conditions allow for portfolio constructions that guarantee profit, even across distinct but related markets.
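The rebalancing case can be illustrated with a minimal sketch. It assumes that each condition's "YES" token trades at a single quoted price and that fees and order book depth can be ignored; neither assumption comes from the paper, and the function name `rebalancing_arbitrage` is hypothetical.

```python
def rebalancing_arbitrage(yes_prices):
    """Classify a rebalancing opportunity in one exhaustive, mutually exclusive market.

    yes_prices: one YES price (between 0 and 1) per condition.
    Returns (side, profit_per_set), where profit_per_set is the guaranteed
    profit from trading one YES token of every condition.
    Illustrative only: fees, slippage and execution risk are ignored.
    """
    total = sum(yes_prices)
    if total < 1.0:
        # Buying one YES token of every condition costs `total` and pays out exactly 1.
        return "long", 1.0 - total
    if total > 1.0:
        # Selling one YES token of every condition collects `total` and owes exactly 1.
        return "short", total - 1.0
    return "none", 0.0


if __name__ == "__main__":
    print(rebalancing_arbitrage([0.30, 0.30, 0.30]))  # prices sum below 1: long side
    print(rebalancing_arbitrage([0.60, 0.50]))        # prices sum above 1: short side
```

Because exactly one condition resolves as true, the payout of a full set of YES tokens is fixed at 1 regardless of the outcome, which is what makes the profit independent of the event itself.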

The formalism extends to multi-market dependencies, but the analysis is restricted to pairs due to computational and LLM context limitations.

Data Collection and Market Landscape

The dataset comprises all Polymarket markets resolved between April 2024 and April 2025, including both market metadata and on-chain bid histories. The authors employ topic classification (using Linq-Embed-Mistral embeddings) to cluster markets and reduce the combinatorial search space for dependency analysis.

Figure 1: Distribution of markets and conditions by topic and end-date, highlighting the dominance of politics and sports.

Liquidity and trading volume are highly concentrated around major political events, particularly the 2024 U.S. election.

Figure 2: (Top) Total liquidity per market by end date and topic. (Bottom) Executed bid volume over time, with U.S. election markets as the primary activity driver.

Detecting Market Dependencies with LLMs

A key methodological contribution is the use of LLMs (DeepSeek-R1-Distill-Qwen-32B) to infer logical dependencies between market conditions. The approach involves:

Reducing markets with >4 conditions to the top-4 by liquidity plus an "other" catch-all, preserving >90% of liquidity.
Prompting the LLM to enumerate all valid outcome vectors for single and paired markets, checking for exclusivity and dependency.
Filtering for pairs within the same topic and end-date to maximize semantic overlap.

Figure 3: Schematic of the LLM-based pipeline for detecting market dependencies.

The LLM approach is validated on single markets (where dependencies are known by construction) and then extended to pairs. Out of 46,360 market pairs in the U.S. election group, only 13 pairs satisfy the strict combinatorial arbitrage definition, reflecting both the centralized market creation process and LLM context limitations.
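Once outcome vectors have been enumerated, the exclusivity and dependency checks can be sketched as below. This is one possible reading, not the authors' code: a single market is treated as well formed if its valid vectors are exactly the one-hot vectors, and a pair is treated as dependent if fewer than n × m joint vectors are valid. All function names are hypothetical, and the vectors are assumed to be already parsed from the LLM output.

```python
from itertools import product


def one_hot_vectors(n):
    """All outcome vectors of length n in which exactly one condition is true."""
    return {tuple(i == j for j in range(n)) for i in range(n)}


def is_well_formed(vectors, n):
    """A single market is exhaustive and mutually exclusive if its valid
    outcome vectors are exactly the n one-hot vectors."""
    return set(vectors) == one_hot_vectors(n)


def are_dependent(joint_vectors, n, m):
    """Assumed criterion: two markets with n and m conditions are dependent
    if some combination of their individual outcomes is ruled out."""
    joint = set(joint_vectors)
    unconstrained = {a + b for a, b in product(one_hot_vectors(n), one_hot_vectors(m))}
    if not joint <= unconstrained:
        raise ValueError("joint vector violates single-market exclusivity")
    return len(joint) < n * m


if __name__ == "__main__":
    T, F = True, False
    # Hypothetical pair: the first outcome of market A implies the first outcome of market B.
    joint = [(T, F, T, F), (F, T, T, F), (F, T, F, T)]
    print(are_dependent(joint, 2, 2))
```

Independent markets admit every combination of outcomes, so any missing combination signals a logical constraint that a portfolio across both markets could exploit.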

Empirical Detection of Arbitrage Opportunities

The authors reconstruct historical order books to compute volume-weighted average prices (VWAP) for each token at block-level granularity. Arbitrage is detected when the sum of relevant token prices deviates from theoretical constraints by at least $0.05, and only during periods of sufficient uncertainty (no token >$0.95).
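The detection rule can be sketched as follows, assuming trades for one token in one block are available as `(price, size)` pairs and that the relevant prices should sum to 1. The thresholds mirror the $0.05 deviation and $0.95 price cap quoted above; the function names and data layout are illustrative assumptions.

```python
def vwap(trades):
    """Volume-weighted average price of (price, size) trades; None if no volume."""
    volume = sum(size for _, size in trades)
    if volume == 0:
        return None
    return sum(price * size for price, size in trades) / volume


def detect_arbitrage(prices, min_deviation=0.05, max_price=0.95):
    """Return 'long', 'short' or None for a set of prices that should sum to 1.

    Opportunities are ignored when any token is priced above max_price,
    since the outcome is then treated as close to settled.
    """
    if any(p > max_price for p in prices):
        return None
    deviation = sum(prices) - 1.0
    if abs(deviation) < min_deviation:
        return None
    return "short" if deviation > 0 else "long"


if __name__ == "__main__":
    yes_price = vwap([(0.40, 100), (0.50, 300)])
    no_price = 0.45  # hypothetical VWAP of the complementary token
    print(yes_price, detect_arbitrage([yes_price, no_price]))
```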

Figure 4: Time series of VWAP prices and detected arbitrage in the "Will Assad remain President of Syria through 2024?" market.

Single-Condition Arbitrage

Arbitrage within single conditions is widespread, with 7,051 out of 17,218 conditions exhibiting at least one opportunity. The median profit per dollar is approximately $0.60, indicating substantial inefficiency.

Figure 5: Distribution of arbitrage opportunities within single conditions, with Crypto markets as notable outliers.

Figure 6: Aggregate arbitrage potential by condition, both uncapped and capped at $100 liquidity. Sports markets show more frequent but smaller opportunities; politics dominates in uncapped profit.

Multi-Condition (NegRisk) Market Arbitrage

Within NegRisk markets, 662 out of 1,578 markets exhibit arbitrage, with both long and short opportunities present. Sports markets are particularly prone to overvaluation, as evidenced by the higher frequency and magnitude of short arbitrage.

Figure 7: Total arbitrage potential across markets, with Sports showing consistent profit, especially in long opportunities.

Combinatorial Arbitrage Across Dependent Markets

Among the 13 dependent market pairs identified, arbitrage opportunities are less frequent and generally lower in liquidity, but still present, especially during periods of market uncertainty.

Figure 8: Distribution of profit per dollar and maximum profit for combinatorial arbitrage in U.S. election market pairs.

Realized Arbitrage and Arbitrageur Behavior

The authors match detected opportunities to user bid histories, grouping transactions within 950-block windows. Approximately 1% of U.S. election arbitrage was exploited, with Sports single markets surpassing politics in realized profit.
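The windowing step can be sketched as below for a single user's transactions. How the paper anchors its 950-block windows is not stated here, so this version assumes that a group starts at its first transaction and absorbs everything within 950 blocks of that start; the function name and tuple layout are hypothetical.

```python
def group_by_block_window(transactions, window=950):
    """Group one user's (block_number, tx_id) pairs into block windows.

    Assumption: a group is anchored at its first transaction, and a new group
    starts once a transaction lies more than `window` blocks after that anchor.
    """
    groups = []
    for block, tx_id in sorted(transactions):
        if groups and block - groups[-1][0][0] <= window:
            groups[-1].append((block, tx_id))
        else:
            groups.append([(block, tx_id)])
    return groups


if __name__ == "__main__":
    txs = [(1051, "d"), (100, "a"), (900, "b"), (3000, "e"), (1050, "c")]
    for group in group_by_block_window(txs):
        print(group)
```

Grouping like this lets trades that together complete an arbitrage position be evaluated as one unit even though they are not executed atomically.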

Figure 9: Total profit realized by users in single-condition arbitrage, with Sports dominating exploited opportunities.

Rebalancing arbitrage in NegRisk markets is primarily realized in politics, with most opportunities yielding low returns but a few outliers in Crypto, Politics, and Twitter.

Figure 10: (Left) Realized profit from rebalancing arbitrage in NegRisk markets. (Right) Yield distribution per dollar, with most opportunities exhibiting low returns.

The top 10 arbitrageurs extracted over $7 million, with the largest single user realizing $2 million in profit. The majority of profit is concentrated among a small set of highly active, likely automated, accounts.

Figure 11: Log-log plot of number of bids vs. aggregate profit per account, highlighting the concentration of profit among a few high-frequency actors.

Market Microstructure and Liquidity Concentration

Liquidity is highly concentrated in the top-ranked conditions within each market, justifying the reduction to 4+1 conditions for LLM analysis.

Figure 12: Cumulative liquidity distribution by condition rank, with >90% of liquidity in the top 4 conditions.
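The 4+1 reduction and the liquidity share it preserves can be sketched as follows. The input is assumed to be a mapping from condition name to liquidity; the function name, the `"other"` key and the sample figures are illustrative and not taken from the paper's data.

```python
def reduce_market(liquidity_by_condition, k=4):
    """Keep the k most liquid conditions and merge the rest into 'other'.

    Returns (reduced, share), where share is the fraction of total liquidity
    held by the k retained conditions. Assumes no condition is already
    named 'other'.
    """
    ranked = sorted(liquidity_by_condition.items(), key=lambda kv: kv[1], reverse=True)
    reduced = dict(ranked[:k])
    if len(ranked) > k:
        reduced["other"] = sum(value for _, value in ranked[k:])
    total = sum(liquidity_by_condition.values())
    share = sum(value for _, value in ranked[:k]) / total if total else 0.0
    return reduced, share


if __name__ == "__main__":
    # Hypothetical liquidity figures for a six-condition market.
    market = {"A": 500, "B": 300, "C": 100, "D": 50, "E": 30, "F": 20}
    reduced, share = reduce_market(market)
    print(reduced, share)
```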

Discussion and Implications

The paper demonstrates that, despite the theoretical efficiency of prediction markets, substantial arbitrage persists due to non-atomic execution, fragmented liquidity, and the complexity of cross-market dependencies. The use of LLMs for dependency detection is effective but limited by context size and prompt engineering challenges. The realized arbitrage volume, while significant in absolute terms, is modest relative to atomic DeFi markets, reflecting the higher execution risk and lower liquidity in prediction markets.

The findings have several implications:

Market Design: Centralized market creation and resolution processes limit the prevalence of combinatorial arbitrage, but as platforms decentralize, more complex dependencies and arbitrage opportunities are likely to emerge.
LLM Reasoning: Current LLMs are effective for pairwise dependency detection but struggle with larger market sets. Advances in context handling and logical reasoning will be necessary for scaling to higher-order dependencies.
Arbitrageur Strategies: The concentration of profit among a few high-frequency actors suggests the emergence of specialized, possibly automated, arbitrageurs analogous to those in DeFi.
Market Efficiency: The persistence of high-magnitude arbitrage, especially in single-condition and sports markets, indicates that prediction markets remain far from informational efficiency, particularly under non-atomic execution constraints.

Conclusion

This work provides a rigorous empirical foundation for understanding arbitrage in decentralized prediction markets. The combination of formal arbitrage taxonomy, LLM-based dependency detection, and large-scale on-chain data analysis reveals both the structural inefficiencies and the evolving strategies of arbitrageurs in Polymarket. As prediction markets grow in scale and complexity, the methodologies developed here will be essential for monitoring market efficiency, designing robust market mechanisms, and understanding the interplay between human and algorithmic actors in decentralized information aggregation.