Determination of the fifth Busy Beaver value (2509.12337v1)

Published 15 Sep 2025 in cs.LO, cs.FL, and math.LO

Abstract: We prove that $S(5) = 47,176,870$ using the Coq proof assistant. The Busy Beaver value $S(n)$ is the maximum number of steps that an $n$-state 2-symbol Turing machine can perform from the all-zero tape before halting, and $S$ was historically introduced by Tibor Radó in 1962 as one of the simplest examples of an uncomputable function. The proof enumerates $181,385,789$ Turing machines with 5 states and, for each machine, decides whether it halts or not. Our result marks the first determination of a new Busy Beaver value in over 40 years and the first Busy Beaver value ever to be formally verified, attesting to the effectiveness of massively collaborative online research (bbchallenge.org).

Summary

  • The paper presents the first formal proof that S(5) equals 47,176,870 via exhaustive enumeration in Coq.
  • It details a robust decider pipeline, from loop detection to NGramCPS and WFAR, that processes all 181,385,789 unique Turing machines.
  • The collaborative approach, with over 27,000 lines of Coq code and formal proofs, sets a new benchmark in verifying uncomputable function boundaries.

Determination of the Fifth Busy Beaver Value: A Formal, Collaborative Milestone

The paper "Determination of the fifth Busy Beaver value" (2509.12337) presents the first formal proof that $S(5) = 47,\!176,\!870$, where $S(n)$ denotes the maximum number of steps a halting $n$-state, 2-symbol Turing machine can execute from the all-zero tape. This result, obtained via a massive collaborative effort and formalized in the Coq proof assistant, marks a significant advance in the study of uncomputable functions and the boundaries of algorithmic decidability. The work also establishes $S(2,4)$ and reconfirms $S(4)$, providing a comprehensive, formally verified landscape of small Busy Beaver values.

Figure 1: 5-state 2-symbol Busy Beaver winner. This machine was discovered by Marxen and Buntrock in 1989 and is now formally proven to be the maximal halting 5-state 2-symbol Turing machine.

Theoretical Context and Motivation

The Busy Beaver function, introduced by Radó in 1962, is a canonical example of a simple yet uncomputable function. $S(n)$ is uncomputable because, if it were computable, it would yield a solution to the halting problem for $n$-state Turing machines: one could run any such machine for $S(n)$ steps and conclude that it never halts if it is still running. Historically, only $S(1)$ through $S(4)$ had been established, with $S(5)$ remaining open for over four decades. The challenge is twofold: the combinatorial explosion in the number of machines and the inherent undecidability of the halting problem, which, for larger $n$, can encode arbitrarily complex mathematical statements.

Formalization and Enumeration: Tree Normal Form

A central technical contribution is the exhaustive enumeration of all 5-state, 2-symbol Turing machines in Tree Normal Form (TNF), which eliminates unreachable transitions and state/symbol permutations. This reduces the search space from $1.67 \times 10^{13}$ to $181,\!385,\!789$ unique machines. The TNF enumeration is implemented directly in Coq, ensuring that no machine is omitted and that the enumeration itself is formally verified.
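
The size of the unreduced search space can be checked with a short calculation. The sketch below assumes the convention that each of the $2n$ entries of the transition table is either undefined (the machine halts there) or one of $2 \times 2 \times n$ choices of written symbol, move direction, and next state. The function name `raw_machine_count` is illustrative and not taken from the paper.

```python
def raw_machine_count(n_states: int, n_symbols: int = 2) -> int:
    """Count transition tables before any normal-form reduction.

    Assumption: every (state, symbol) entry is either undefined (halt)
    or a triple (write symbol, move left/right, next state).
    """
    choices_per_entry = n_symbols * 2 * n_states + 1
    entries = n_states * n_symbols
    return choices_per_entry ** entries


total = raw_machine_count(5)
tnf_count = 181_385_789  # TNF machine count reported in the paper

print(total)                     # 16679880978201, i.e. about 1.67e13
print(round(total / tnf_count))  # rough reduction factor achieved by TNF
```

Under this assumption the count for five states is $21^{10}$, which agrees with the $1.67 \times 10^{13}$ figure quoted above.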

Figure 2: Transition table of the 2-state 4-symbol Busy Beaver winner found by Ligocki and Ligocki in 2005, also formally verified in this work.

Decider Pipeline: Automated and Formal Halting Analysis

The proof employs a multi-stage pipeline of deciders—algorithms that attempt to decide the halting status of a Turing machine. The pipeline is as follows:

  1. Loops: Detects machines that enter periodic or spatially translated cycles (Cyclers and Translated Cyclers).
  2. NGramCPS: Uses $n$-gram-based local configuration analysis, with alphabet augmentations (fixed-length history, LRU) to capture more complex behaviors.
  3. RepWL: Employs regular expressions over repeated tape blocks to generalize and close over infinite families of configurations.
  4. FAR/WFAR: Utilizes (weighted) finite automata to construct regular or nonregular over-approximations of reachable configurations, providing certificates of nonhalting.
  5. Individual Proofs: For 13 "Sporadic Machines" not captured by automated deciders, bespoke Coq proofs are constructed.

This pipeline, implemented and verified in Coq, decides the halting status of all $181,\!385,\!789$ machines, with only 13 requiring manual intervention.
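
The simplest idea behind the first stage can be sketched in a few lines: if a machine ever revisits exactly the same configuration (state, head position, tape), it can never halt. The following is a minimal illustration, not the Coq-BB5 implementation. The function name `is_cycler`, the table encoding, and the `bouncer` and `drifter` machines are assumptions made for this example, and the step budget is arbitrary.

```python
from typing import Dict, Optional, Tuple

# Transition table: (state, read symbol) -> (write symbol, head move, next state).
# A missing entry is an undefined transition, treated here as halting.
Table = Dict[Tuple[str, int], Tuple[int, int, str]]


def is_cycler(table: Table, max_steps: int = 1000) -> Optional[bool]:
    """Return True if an exact configuration repeats within max_steps
    (so the machine never halts), False if the machine halts, and None
    if neither is observed within the step budget."""
    tape: Dict[int, int] = {}  # sparse tape; absent cells hold 0
    head, state = 0, "A"
    seen = set()
    for _ in range(max_steps):
        # Keep only the positions holding 1, so explicitly written zeros
        # do not distinguish otherwise identical configurations.
        ones = frozenset(p for p, s in tape.items() if s == 1)
        config = (state, head, ones)
        if config in seen:
            return True
        seen.add(config)
        symbol = tape.get(head, 0)
        if (state, symbol) not in table:
            return False  # undefined transition: the machine halts
        write, move, state = table[(state, symbol)]
        tape[head] = write
        head += move
    return None


# Hypothetical machine whose head bounces between two cells forever.
bouncer: Table = {("A", 0): (0, +1, "B"), ("B", 0): (0, -1, "A")}
# The 2-state, 2-symbol champion, which halts after 6 steps.
champion: Table = {
    ("A", 0): (1, +1, "B"), ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
}
# Hypothetical machine that drifts right forever (a Translated Cycler).
drifter: Table = {("A", 0): (1, +1, "A")}

print(is_cycler(bouncer), is_cycler(champion), is_cycler(drifter))
# True False None
```

The `None` result for `drifter` shows why exact configuration matching is not enough: a Translated Cycler repeats its pattern at a shifted position, so a decider has to compare configurations up to translation.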

Figure 3: Space-time diagrams of the first 45 steps of a Cycler, illustrating a machine that eventually repeats the same configuration forever.

Figure 4: 10,000-step space-time diagram of a Translated Cycler not decided by the decider for loops in Coq-BB5 (it is decided by NGramCPS).

Figure 5: 10,000-step space-time diagram of a "fractal-looking" 5-state Turing machine that is solved by the LRU augmentation but has no known solution with standard NGramCPS or the fixed-length history augmentation.

Formal Verification in Coq: Proof by Reflection

The entire enumeration, decider pipeline, and verification are implemented in Coq, leveraging proof by reflection. This approach ensures that both the algorithms and their correctness proofs are machine-checked. The Coq development comprises over 27,000 lines of code and 638 lemmas, with an additional 10,000 lines for imported proofs. The proof compiles in under an hour on commodity hardware, demonstrating the feasibility of large-scale formal verification for combinatorial problems.

Handling Irregular and Pathological Machines

While the majority of machines are handled by regular CTL-based deciders, a small set of "irregular" machines require nonregular techniques (WFAR) or individual analysis. Notably, the 13 Sporadic Machines exhibit behaviors such as extremely long pre-periods before entering cycles (e.g., $5.4 \times 10^{51}$ steps), double Fibonacci counters, and obfuscated Gray code transformations. These cases highlight the diversity and complexity of behaviors even in small Turing machines.

Figure 6: Family picture of the 5-state Sporadic Machines (20,000-step space-time diagrams) which required individual Coq nonhalting proofs.

Implications for Computability and Mathematical Logic

The formal determination of $S(5)$ provides a concrete boundary for the knowable in the context of Busy Beaver values. For $n > 5$, the halting problem for $n$-state machines can encode open mathematical conjectures (e.g., Goldbach, Riemann), and for sufficiently large $n$, even the consistency of ZF set theory. The work identifies "Cryptids", machines whose halting status is believed to be mathematically hard, such as the 6-state "Antihydra" machine, which encodes a Collatz-like problem.

Collaborative and Open Research Model

The project was conducted as a massively collaborative, open research effort via bbchallenge.org, involving hundreds of contributors, most without academic affiliation. The infrastructure included a Discord server, a wiki, and a public codebase, with contributions in a wide range of programming languages. The collaborative model facilitated rapid development, cross-verification, and the integration of diverse expertise, culminating in a formally verified result.

Scaling, Performance, and Future Directions

The Coq-based approach demonstrates that large-scale, combinatorial proofs can be both feasible and trustworthy. The enumeration and verification pipeline is parallelizable and can be further optimized. However, the exponential growth in the number of machines and the undecidability barrier imply that $S(6)$ and beyond are likely intractable with current methods, especially given the presence of Cryptids.

Figure 7: Main zoological families that were identified among 5-state Turing machines, together with Cyclers and Translated Cyclers.

Conclusion

This work establishes $S(5) = 47,\!176,\!870$ and $S(2,4)$ via a fully formal, collaborative, and reproducible process, setting a new standard for rigor in the study of uncomputable functions. The integration of advanced decider pipelines, formal verification, and open collaboration provides a template for future research at the interface of computability, logic, and large-scale formal mathematics. The dataset and methods developed here will serve as benchmarks for future AI reasoning systems and as a foundation for further exploration of the boundaries of algorithmic knowledge.

Explain it Like I'm 14

What is this paper about?

This paper proves a famous number in computer science called the fifth Busy Beaver value: S(5) = 47,176,870. That means they showed that among all tiny, 5‑state “Turing machines” that start on a blank tape and eventually stop, the one that runs the longest takes exactly 47,176,870 steps before halting. They didn’t just test machines: they created a formal, computer-checked proof using a tool called Coq, so other scientists can trust it completely.

What questions did the researchers want to answer?

  • What is the exact maximum number of steps any 5‑state, 2‑symbol Turing machine can take before it stops, starting from an all‑zero (blank) tape?
  • Can we verify this answer in a way that’s guaranteed correct, using a proof assistant (a program that checks math proofs)?
  • Can collaborative online research and careful programming solve a problem that many people thought would be too hard?

How did they do it? (Methods explained simply)

Think of a Turing machine as a very tiny robot sitting on an infinite strip of paper (the “tape”). Each square on the tape has a symbol (like 0 or 1). The robot has a small “state” inside it (like A, B, C, …). At every step, it: 1) reads the symbol under its head, 2) writes a new symbol, 3) moves left or right, 4) switches to a new state, following its rule table.

The Busy Beaver game asks: among all such robots with exactly n states and 2 symbols, which one runs the longest before it stops? That longest time is called S(n).
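
For two states the game is small enough to play by brute force. The sketch below enumerates every 2-state, 2-symbol rule table, runs each one from a blank tape, and records the longest run that stops. It relies on two assumptions that are labeled in the code: an empty table entry means "halt" and counts as the final step, and any machine still running after `cap` steps is simply ignored. The second assumption is harmless here only because the known value $S(2) = 6$ is far below the cap. For 5 states it is exactly the shortcut that the paper's deciders are designed to avoid.

```python
from itertools import product

STATES = "AB"
SYMBOLS = (0, 1)
MOVES = (-1, +1)


def steps_until_halt(table, cap):
    """Run from the all-zero tape; return the step count if the machine
    halts within cap steps, else None. Reaching an undefined (None)
    table entry counts as the final, halting step (assumed convention)."""
    tape, head, state = {}, 0, "A"
    for step in range(1, cap + 1):
        action = table[(state, tape.get(head, 0))]
        if action is None:
            return step
        write, move, state = action
        tape[head] = write
        head += move
    return None


def longest_halting_run(cap=50):
    """Maximum halting time over all 2-state tables, ignoring any machine
    that has not halted after cap steps (a heuristic, not a proof)."""
    keys = [(q, s) for q in STATES for s in SYMBOLS]
    actions = [None] + list(product(SYMBOLS, MOVES, STATES))
    best = 0
    for choice in product(actions, repeat=len(keys)):
        steps = steps_until_halt(dict(zip(keys, choice)), cap)
        if steps is not None:
            best = max(best, steps)
    return best


print(longest_halting_run())  # 6, matching the known value S(2) = 6
```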

To find S(5), the team:

  • Generated all the relevant 5‑state, 2‑symbol machines, carefully avoiding duplicates using a smart recipe called “Tree Normal Form.” This shrank the search from roughly 16.7 trillion possibilities down to about 181 million.
  • For each of these 181,385,789 machines, they tried to decide: will it halt (stop) or run forever? Simply simulating step by step is sometimes enough, but not always, because “runs forever” can be tricky to prove.
  • They built and proved correct a toolbox of “deciders.” A decider is like a detective: it reasons about a machine’s behavior without needing to simulate forever. Their main framework was called Closed Tape Language (CTL), which, in simple terms, creates a safe “fence” around all the kinds of tape patterns a machine can reach and shows that none of those patterns lead to halting. If the fence is closed under the machine’s rules and contains no halting point, the machine can’t halt.
  • A very small number of especially complicated machines (13 of them) needed careful, custom proofs (“Sporadic Machines”). The team wrote detailed arguments for each and checked them in Coq.
  • They performed all of this inside the Coq proof assistant. Coq is like a super‑strict math teacher that won’t accept any steps unless they’re proven. They wrote the algorithms, proved those algorithms are correct, ran them, and let Coq certify the final results. This approach is called “proof by reflection.”
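
The “fence” argument behind CTL can be written down compactly. In the sketch below, the notation is chosen here for illustration and is not necessarily the paper's: $c_0$ is the starting configuration (blank tape, first state), $\mathrm{step}(c)$ is the configuration that follows $c$, and $L$ is the fenced-in set of configurations.

```latex
\[
  \bigl( c_0 \in L \bigr)
  \;\wedge\;
  \bigl( \forall c \in L:\ c \text{ is not halting and } \mathrm{step}(c) \in L \bigr)
  \;\Longrightarrow\;
  \forall k \ge 0:\ \mathrm{step}^{k}(c_0) \in L
\]
```

The conclusion follows by induction on $k$: the start is inside the fence, and every step from inside the fence stays inside. Since no configuration in $L$ is halting, the machine never halts. The hard part, which the deciders automate, is finding a suitable $L$ that can be described finitely.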

They also used a community platform (bbchallenge.org) where many contributors shared ideas, code, and checks. This collaboration helped design better deciders and speed up progress.

What did they find and why does it matter?

Main results:

  • They proved S(5) = 47,176,870. That means the known 5‑state champion machine really is the longest‑running one before stopping.
  • This is the first new Busy Beaver value determined in over 40 years.
  • It’s the first Busy Beaver value ever to be formally verified by a proof assistant, which is a big deal for trust and reproducibility in mathematics and computer science.
  • They also formally verified several other small Busy Beaver values (including previous ones) and solved another class: the case with 2 states and 4 symbols.

Why it’s important:

  • Busy Beaver numbers grow extremely fast and are connected to deep ideas, like the “halting problem” (the impossibility of having a program that always decides if any program will halt).
  • By proving S(5) with fully checked methods, the team showed that large, tricky, exploratory proofs can be done in a reliable way.
  • Their methods and tools can be used to push the frontier for bigger cases (like 6‑state machines), where the problems start to touch famous unsolved math questions.

What does this mean for the future?

  • The approach—smart enumeration, powerful deciders, and formal verification—creates a strong foundation for tackling harder Busy Beaver cases.
  • For 6‑state machines and beyond, some specific machines look “cryptic” (the paper calls them “Cryptids”): deciding whether they halt may be as hard as long‑standing open problems in math. This makes the Busy Beaver game an exciting way to generate new, meaningful challenges.
  • The fully checked dataset and methods are great testbeds for AI systems that try to do mathematical reasoning.
  • The project also shows how large, open, online collaborations can successfully produce serious research, similar to open‑source software—just with math proofs.

In short, this work determined S(5) exactly, proved it in a way everyone can trust, and opened doors to exploring even deeper problems where simple‑looking machines can hide surprising complexity.

Knowledge Gaps

Knowledge gaps, limitations, and open questions

Below is a consolidated list of unresolved issues the paper highlights or implies, framed to be concrete and actionable for future researchers.

  • Complete, fully verified TNF enumeration for 6-state 2-symbol machines is unfinished; scale the Coq implementation to cover ~33 billion machines and formally guarantee no winners are missed.
  • Determine exact values for S(6), S(3,3), and S(2,5); current status includes substantial holdouts and suspected champions but no formal resolutions.
  • Prove or refute halting for identified 6-state “Cryptids,” especially:
    • Antihydra: settle whether its Collatz-like odd/even imbalance conjecture implies non-halting from all-zero tape.
    • “BMO Problem 1” machine: establish whether there exists an index i with a_i = b_i (equivalently, that the machine halts).
    • The probviously halting 3-state 3-symbol machine expected to vastly extend current lower bounds: provide a rigorous halting proof and resulting exact S(3,3) value.
  • Develop CTL deciders (and other techniques) that systematically subsume the 13 Sporadic Machines, eliminating the need for bespoke, machine-specific proofs in similar future pipelines.
  • Generalize and formally verify all deciders used informally by the community (e.g., FAR) within Coq, and quantify their coverage and failure modes across larger machine classes.
  • Provide a rigorous methodology to turn “probviously” (probabilistic heuristic) non-halting/halting arguments into formal proofs (e.g., via supermartingales, drift conditions, or invariants), and benchmark their success on Cryptids.
  • Establish nontrivial upper bounds for S(6) (beyond decidability barriers), or produce verified acceleration techniques capable of detecting halting beyond current champions.
  • Complete a formal, quantitative coverage analysis of CTL: for each decider, measure what proportion of the TNF search space it decides, characterize undecided patterns, and derive new deciders targeted at specific residual families.
  • Produce a comprehensive, formal taxonomy (“zoology”) of 5-state machine behaviors (e.g., counters, Gray-code generators, large chaotic pre-loopers), including precise invariants, templates, and automated recognition tools.
  • Characterize all 5-state counters: give a complete classification of counter architectures achievable within 5 states, with correctness proofs and tight bounds on their step counts and tape growth.
  • Find and formally prove the maximal loop length among 5-state machines without halting transitions, and develop scalable methods to certify enormous eventual loops (e.g., $>10^{51}$ steps) without full simulation.
  • Investigate universality at 5 states in this model: either construct a 5-state universal Turing machine or prove impossibility under the paper’s conventions (2 symbols, bi-infinite tape, undefined-transition halting).
  • Extend TNF completeness proofs and enumeration machinery to multi-symbol classes beyond (2,4), ensuring that reductions do not exclude potential winners in classes like (3,3), (3,4), and (4,3).
  • Systematically study Busy Beaver values under alternative models discussed (quadruple TMs, turmites, lambda calculus): formalize translations, preserve semantics, and compute or bound corresponding S(n) variants.
  • Explore Busy Beaver values from non-all-zero initial tapes (and other inputs): define precise variants, develop enumeration/decider pipelines, and determine whether small-state universality or extreme behaviors arise.
  • Cross-verify Coq-BB5 in other proof assistants (e.g., Lean, Isabelle) to strengthen trust, and aim for a proof that avoids additional axioms (e.g., remove reliance on functional_extensionality_dep via alternative encoding).
  • Provide human-understandable mechanistic explanations for extremely large halters (including the 5-state winner), distilling repeated macro-configurations and proving macro-step lemmas that generalize to larger classes.
  • Improve parallelization, caching, and certified acceleration in Coq for large-scale enumeration/proof-by-reflection workloads; document resource footprints needed to scale to 6-state and 3-symbol classes.
  • Establish a standardized benchmark suite (including Cryptids and near-threshold machines) for evaluating AI theorem provers and program analyzers on Busy Beaver-style problems, with ground-truth formal certificates.