Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets
Abstract: Polymarket is a prediction market platform where users can speculate on future events by trading shares tied to specific outcomes, known as conditions. Each market is associated with a set of one or more such conditions. To ensure proper market resolution, the condition set must be exhaustive -- collectively accounting for all possible outcomes -- and mutually exclusive -- only one condition may resolve as true. Thus, the collective prices of all related outcomes should be \$1, representing a combined probability of 1 of any outcome. Despite this design, Polymarket exhibits cases where dependent assets are mispriced, allowing for purchasing (or selling) a certain outcome for less than (or more than) \$1, guaranteeing profit. This phenomenon, known as arbitrage, could enable sophisticated participants to exploit such inconsistencies. In this paper, we conduct an empirical arbitrage analysis on Polymarket data to answer three key questions: (Q1) What conditions give rise to arbitrage? (Q2) Does arbitrage actually occur on Polymarket? (Q3) Has anyone exploited these opportunities? A major challenge in analyzing arbitrage between related markets lies in the scalability of comparisons across a large number of markets and conditions, with a naive analysis requiring $O(2^{n+m})$ comparisons. To overcome this, we employ a heuristic-driven reduction strategy based on timeliness, topical similarity, and combinatorial relationships, further validated by expert input. Our study reveals two distinct forms of arbitrage on Polymarket: Market Rebalancing Arbitrage, which occurs within a single market or condition, and Combinatorial Arbitrage, which spans across multiple markets. We use on-chain historical order book data to analyze when these types of arbitrage opportunities have existed, and when they have been executed by users. We find a realized estimate of 40 million USD of profit extracted.
Explain it Like I'm 14
Overview
This paper studies a website called Polymarket, where people trade “Yes” and “No” shares about future events, like “Will Team A win?” or “Will Candidate B become president?” The price of a share acts like the crowd’s best guess of the chance that something will happen.
The authors look for mistakes in how these shares are priced. When prices don’t make sense, smart traders can lock in a risk-free profit. That’s called arbitrage. The paper explains two kinds of arbitrage that can happen on Polymarket, shows how often they occur, and estimates how much money people made from them.
What were they trying to find out?
The paper asks three simple questions:
- Q1: What kinds of situations create arbitrage in prediction markets?
- Q2: Do these arbitrage opportunities actually appear on Polymarket?
- Q3: Has anyone taken advantage of them in real life?
How did they study it?
The authors combined math, computer tools, and real trading data.
Here’s the idea in everyday language:
- Markets and conditions: A “market” is a big question, and its “conditions” are the possible answers. For example, “Who will win?” might have three conditions: Team A wins, Team B wins, or a tie. Only one of these can be true in the end.
- Prices should add to 1: If you add up the “Yes” prices for all the conditions in a single market, they should total 1 (which means 100%). If the sum is less than 1, you can buy a bit of everything and guarantee a profit when the event resolves. If it’s more than 1, you can profit by selling or shorting in the right way.
- Related markets: Sometimes different markets talk about the same event in different ways. For example, one market asks whether Team A wins, while another asks whether Team A wins by 2 or more. These are linked: if Team A wins by 2 or more, then Team A must also have won. The authors checked for these links between markets.
- Scaling the search: There are thousands of markets. Comparing every pair of markets in every possible way would take forever. So the authors reduced the search by:
  - Grouping markets that end on the same date.
  - Grouping by topic (like Politics or Sports).
  - Using text similarity to find markets that talk about the same event.
- Using AI (LLMs): They used an LLM (an AI text assistant) to read market descriptions and spot logical relationships. Think of it like a very careful reader that says, “If this statement is true, that statement must also be true,” and writes down all the possible combinations that make sense. They also did “prompt engineering,” which means they carefully designed the instructions given to the AI so it would answer in a useful, structured way.
- On-chain data: Polymarket records matched trades on the blockchain, which is a public, tamper-resistant database. The authors downloaded these records to see when and how trades happened, and to catch moments when arbitrage would have been possible or actually happened.
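The search-reduction idea described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the market fields (`id`, `end_date`, `topic`, `question`), the word-overlap (Jaccard) similarity, and the `min_similarity` threshold are all assumptions made for this example, and the paper may well use a different similarity measure.

```python
import re
from collections import defaultdict
from itertools import combinations


def tokens(text):
    """Lowercase word set of a market question (punctuation dropped)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two questions, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def candidate_pairs(markets, min_similarity=0.3):
    """Return pairs of market ids worth a closer look.

    Markets are first bucketed by (end date, topic); only markets in the
    same bucket are compared, and only sufficiently similar pairs are kept.
    The threshold is a hypothetical value chosen for this example.
    """
    buckets = defaultdict(list)
    for m in markets:
        buckets[(m["end_date"], m["topic"])].append(m)

    pairs = []
    for group in buckets.values():
        for a, b in combinations(group, 2):
            if jaccard(a["question"], b["question"]) >= min_similarity:
                pairs.append((a["id"], b["id"]))
    return pairs


# Made-up markets, for illustration only.
markets = [
    {"id": "m1", "end_date": "2024-11-05", "topic": "Sports",
     "question": "Will Team A win the final?"},
    {"id": "m2", "end_date": "2024-11-05", "topic": "Sports",
     "question": "Will Team A win the final by 2 or more?"},
    {"id": "m3", "end_date": "2024-11-05", "topic": "Politics",
     "question": "Will Candidate B become president?"},
    {"id": "m4", "end_date": "2024-12-01", "topic": "Sports",
     "question": "Will Team A win the final?"},
]

print(candidate_pairs(markets))
```

Here only `m1` and `m2` share an end date and topic and have overlapping wording, so they are the only pair passed on for a closer logical check (the step the summary says was done with an LLM).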
What did they find?
The authors found two main types of arbitrage:
- Market Rebalancing Arbitrage (inside one market): If the total “Yes” prices for all outcomes in a single market add up to less than 1, you can buy a little of every outcome and guarantee a profit when one of them wins. If they add up to more than 1, you can profit by selling or shorting the overpriced outcomes instead. Either way, these trades push prices back toward sensible values.
- Combinatorial Arbitrage (across multiple related markets): If two markets are logically linked (like “Team A wins” and “Team A wins by 2+”), you can build a combo of positions across both that guarantees at least one will pay off. The profit comes from the mismatch between how the two markets are priced.
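As a rough worked example of both checks, the sketch below computes the guaranteed profit per set of shares under simplifying assumptions that are not taken from the paper: each share pays 1 if its outcome is true, a “No” share costs 1 minus the “Yes” price, and fees, slippage, and order-book depth are ignored. The function names and prices are hypothetical.

```python
def rebalancing_long_profit(yes_prices):
    """Profit from buying one "Yes" share of every condition in one market.

    Exactly one condition resolves true, so the basket pays 1.
    The result is positive only when the prices sum to less than 1.
    """
    return 1.0 - sum(yes_prices)


def implication_arbitrage_profit(p_specific_yes, p_general_yes):
    """Profit when a specific outcome implies a more general one.

    Example: "Team A wins by 2+" (specific) implies "Team A wins" (general),
    so the specific outcome should never cost more than the general one.
    Strategy: buy "Yes" on the general outcome and "No" on the specific one.
      - specific true  -> general "Yes" pays 1
      - general false  -> specific "No" pays 1
      - general true, specific false -> both pay (2 in total)
    The basket therefore pays at least 1 in every case.
    """
    cost = p_general_yes + (1.0 - p_specific_yes)  # assumes No = 1 - Yes
    return 1.0 - cost


# Made-up prices, for illustration only.
single_market = [0.30, 0.35, 0.30]           # Team A, Team B, tie
print(rebalancing_long_profit(single_market))

# "Wins by 2+" priced above "wins": a logical inconsistency.
print(implication_arbitrage_profit(p_specific_yes=0.40, p_general_yes=0.35))
```

In both examples the guaranteed profit is about 0.05 per set of shares. A result of zero or below means the prices are consistent and the trade is not worth making.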
Why it matters:
- These mispricings do happen on Polymarket.
- People exploit them. By analyzing the blockchain records, the authors estimate that around 40 million USD of profit was made from these kinds of arbitrage during the year they studied (April 2024 to April 2025).
Why are these results important?
- Price accuracy: Arbitrage helps fix prices. When traders spot a mistake and trade against it, they push prices back to better reflect true probabilities. That makes the market’s forecasts more reliable.
- Real-world impact: During big events (like the 2024 U.S. elections), Polymarket had huge trading volume. If prices are accurate, the market’s odds can be a useful signal for the public, journalists, and decision-makers.
- Technology and fairness: On blockchains, arbitrage can be fast and complex. Skilled traders may capture most of the profits. Understanding these patterns helps platforms and users think about fairness and design better systems.
What does this mean for the future?
- Better market design: Platforms can add guardrails to reduce mispricing, improve how related markets are built, and make price consistency easier.
- Smarter monitoring: The AI approach the authors used shows that we can automatically flag linked markets and find errors early. That could help platforms, regulators, or researchers watch for problems in real time.
- Trade-offs: Arbitrage is often considered helpful because it corrects prices. But there’s ongoing debate about whether specialized traders capture too much value compared to regular users. Designing systems that are both efficient and fair remains a challenge.
In short, the paper shows that arbitrage in prediction markets is real, measurable, and significant. By combining smart math, AI reading of market text, and blockchain data, the authors mapped out when and how mispricings happen—and how much money they can generate.