Four-digit Kaprekar dynamics in odd bases
Abstract: Start with four digits, arrange them in both descending and ascending order, subtract, and repeat. This simple process is known as the Kaprekar routine, famous in base ten for sending every nonconstant four-digit string to $6174$. We show that in every odd base $B>3$, the four-digit Kaprekar map has an unexpectedly rigid structure. After at most three iterations, every nonconstant orbit enters an explicit triangular region $\mathcal{T}_B$, and on this region the map is conjugate to projective doubling: \[ \{[r],[s]\}\longmapsto \{[2r],[2s]\}. \] This gives a complete finite description of all nonconstant terminal cycles, including an explicit formula for their lengths and counts. In particular, the longest terminal cycle has length at most $(B-1)/2$, and equality can occur only when $B$ is prime. For primes $p>5$, equality occurs precisely when the least positive $m$ with $2^m\equiv\pm1\pmod p$ is $m=(p-1)/2$. The results proved here were first formulated by Schwartz and Thakur. As a test case for AI-assisted formal mathematics, AxiomProver produced Lean/mathlib formalizations of these results.
Explain it Like I'm 14
Overview
This paper studies a simple number trick called the Kaprekar routine for 4 digits, but not just in base 10. It looks at what happens in every odd base (like base 7, 9, 11, …). The authors show that, after a few steps, this “sort the digits, subtract, and repeat” process follows a very clean rule: it turns into “double two numbers on a clock and ignore minus signs.” This lets them describe exactly which loops (cycles) can happen and how long they are.
Key questions
The paper asks:
- If you do the four-digit Kaprekar routine in an odd base B (B > 3), what patterns do you always end up in?
- Can we describe all the end behavior (the loops/cycles) in a simple, uniform way?
- How long can these loops be, and when do we get the longest ones?
How did they study it?
The routine is:
- Take four digits.
- Arrange them in descending order and in ascending order.
- Subtract the smaller from the bigger.
- Repeat.
Example in base 10: 7641 − 1467 = 6174, and 6174 stays fixed forever.
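The routine above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper; the helper names `to_digits`, `from_digits`, and `kaprekar_step` are made up for this example, and leading zeros are kept so that every state has exactly four digits.

```python
def to_digits(n, base, width=4):
    """Digits of n in the given base, most significant first, zero-padded."""
    digits = []
    for _ in range(width):
        n, d = divmod(n, base)
        digits.append(d)
    return digits[::-1]


def from_digits(digits, base):
    """Value of a digit list (most significant first) in the given base."""
    value = 0
    for d in digits:
        value = value * base + d
    return value


def kaprekar_step(digits, base):
    """One step: sort descending and ascending, subtract, return the digits."""
    descending = sorted(digits, reverse=True)
    ascending = sorted(digits)
    difference = from_digits(descending, base) - from_digits(ascending, base)
    return to_digits(difference, base, len(digits))


print(kaprekar_step([7, 6, 4, 1], 10))  # [6, 1, 7, 4], i.e. 7641 - 1467 = 6174
print(kaprekar_step([6, 1, 7, 4], 10))  # [6, 1, 7, 4], the fixed point
```

The same function works unchanged in any base, which is all the later sections need.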
The clever idea in this paper is to stop tracking all four digits and instead track just two differences:
- d1 = biggest digit − smallest digit (outer difference)
- d2 = second-biggest digit − third-biggest digit (inner difference)
These two numbers (d1, d2) are enough to determine what happens next. The authors prove that after at most three steps, these two numbers always land in a nice “stable zone” where:
- d1 > d2 > 0, and both are odd.
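The "at most three steps" claim can be checked by brute force for small odd bases. The sketch below is an illustration under the definitions given above (d1 = largest minus smallest digit, d2 = difference of the two middle digits); the function names are hypothetical and the search is exhaustive, so it is only practical for small bases.

```python
from itertools import product


def kaprekar_step(digits, base):
    """One sort-and-subtract step on a list of four digits (leading zeros kept)."""
    desc = sorted(digits, reverse=True)
    high = sum(d * base**i for i, d in enumerate(reversed(desc)))
    low = sum(d * base**i for i, d in enumerate(desc))
    n = high - low
    out = []
    for _ in digits:
        n, d = divmod(n, base)
        out.append(d)
    return out[::-1]


def differences(digits):
    """Outer and inner differences (d1, d2) of a four-digit string."""
    a, b, c, d = sorted(digits, reverse=True)
    return a - d, b - c


def in_stable_zone(d1, d2):
    """Stable zone as described above: d1 > d2 > 0 with both odd."""
    return d1 > d2 > 0 and d1 % 2 == 1 and d2 % 2 == 1


def steps_to_stable_zone(digits, base, limit=10):
    """Number of steps until the differences land in the stable zone, or None."""
    for step in range(limit + 1):
        if in_stable_zone(*differences(digits)):
            return step
        digits = kaprekar_step(digits, base)
    return None


def worst_case_entry(base):
    """Largest entry time over all nonconstant four-digit strings in this base."""
    worst = 0
    for digits in product(range(base), repeat=4):
        if len(set(digits)) == 1:
            continue  # constant strings collapse to 0000 and are excluded
        steps = steps_to_stable_zone(list(digits), base)
        if steps is None:
            raise RuntimeError(f"{digits} never entered the stable zone")
        worst = max(worst, steps)
    return worst


for base in (5, 7, 9, 11):
    print(base, worst_case_entry(base))
```

Running the loop prints the worst-case entry time per base, which should never exceed three if the stated bound holds.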
Then they switch to even better coordinates:
- r = (d1 + d2)/2 (the half-sum),
- s = (d1 − d2)/2 (the half-difference), which are both positive integers once you’re in the stable zone.
Now comes the surprise: if you look at r and s using clock arithmetic (modulo B), and you treat a number and its negative as the same (that’s what “projective” means here—signs don’t matter), then one Kaprekar step simply doubles both r and s. In short:
- After relabeling, “sort-and-subtract” becomes “double both numbers,” with signs ignored, and order of the pair ignored.
This relationship is called a conjugacy: the original process is the same as “projective doubling” after a change of coordinates.
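To make the conjugacy concrete, the following sketch compares one Kaprekar step with doubling for every pair in the stable zone of a given odd base. It is an illustration only: `proj` picks the smaller of x and B − x as a representative of "x up to sign", and `kaprekar_diff_step` uses the digit string (d1, d2, 0, 0) as a stand-in for any string with those differences. Both names are assumptions of this example.

```python
def kaprekar_diff_step(d1, d2, base):
    """One Kaprekar step on the difference pair, via the digits (d1, d2, 0, 0)."""
    n = (d1 * base**3 + d2 * base**2) - (d2 * base + d1)
    digits = []
    for _ in range(4):
        n, d = divmod(n, base)
        digits.append(d)
    a, b, c, d = sorted(digits, reverse=True)
    return a - d, b - c


def proj(x, base):
    """Representative of x modulo base with x and -x identified."""
    x %= base
    return min(x, base - x)


def check_conjugacy(base):
    """True if every stable pair steps to a stable pair that matches doubling."""
    for d1 in range(3, base - 1, 2):      # odd d1 up to base - 2
        for d2 in range(1, d1, 2):        # odd d2 below d1
            r, s = (d1 + d2) // 2, (d1 - d2) // 2
            e1, e2 = kaprekar_diff_step(d1, d2, base)
            if not (e1 > e2 > 0 and e1 % 2 == 1 and e2 % 2 == 1):
                return False              # left the stable zone
            r_next, s_next = (e1 + e2) // 2, (e1 - e2) // 2
            stepped = {proj(r_next, base), proj(s_next, base)}
            doubled = {proj(2 * r, base), proj(2 * s, base)}
            if stepped != doubled:
                return False
    return True


for base in (5, 7, 9, 11, 13):
    print(base, check_conjugacy(base))
```

Comparing sets rather than tuples is what encodes "order of the pair ignored".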
Main findings and why they matter
Here are the key results, in plain language:
- After at most three steps, every non-constant four-digit input enters the stable zone where the process is easy to describe.
- Inside that zone, one Kaprekar step = “double both numbers mod B, ignoring signs,” on an unordered pair. This makes the behavior very rigid and predictable.
- The authors can completely list all possible loops (cycles) and how many there are, for any odd base B > 3.
- The longest possible cycle length is at most (B − 1)/2.
- That maximum is only reached when:
- B is a prime number, and
- doubling on the “clock of size B,” ignoring signs, takes exactly (B − 1)/2 steps to come back to where you started. (In math-speak: the smallest m with 2^m ≡ ±1 (mod B) equals (B − 1)/2.)
- For primes p > 5 where the maximum is reached, the number of longest cycles is floor(( (p − 1)/2 − 1 ) / 2).
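The quantity in the last two bullets, the smallest m with 2^m ≡ ±1 (mod B), is easy to compute directly. The sketch below is a simple illustration; `projective_order`, `is_prime`, and `attains_maximum` are names chosen for this example, and the primality test is plain trial division suitable only for small bases.

```python
def projective_order(base):
    """Least m >= 1 with 2**m congruent to +1 or -1 modulo an odd base >= 3."""
    if base < 3 or base % 2 == 0:
        raise ValueError("expected an odd base of at least 3")
    power, m = 2 % base, 1
    while power not in (1, base - 1):
        power = (2 * power) % base
        m += 1
    return m


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def attains_maximum(base):
    """Criterion stated above: a prime base > 5 whose projective order is (B - 1)/2."""
    return base > 5 and is_prime(base) and projective_order(base) == (base - 1) // 2


for base in (7, 9, 11, 13, 17):
    print(base, projective_order(base), (base - 1) // 2, attains_maximum(base))
```

For example, in base 17 the powers of 2 are 2, 4, 8, 16 ≡ −1, so the projective order is 4, well short of (17 − 1)/2 = 8.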
Why this matters:
- It turns a messy digit process (sorting and subtracting) into a clean arithmetic rule (doubling modulo B, with signs ignored). That’s a big simplification.
- It explains, across all odd bases, exactly what kinds of loops happen and how long they can be.
- It shows the behavior depends on simple number properties of the base, especially how powers of 2 behave modulo the base.
Short, concrete examples
- Base 7: The longest cycles have length 3 = (7 − 1)/2, and there is exactly one such cycle.
- Base 11: The longest cycles have length 5 = (11 − 1)/2, and there are exactly 2 of them.
- Base 9 (not prime): The longest cycle is only length 3 (shorter than (9 − 1)/2 = 4), because in base 9 the doubling rule returns early (2^3 ≡ −1 mod 9).
Think of doubling “around a circle” with B spots. If doubling visits every spot (up to sign) before returning to the start, you get the longest possible loop. If it comes back sooner, loops are shorter.
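These examples can be reproduced by listing the cycles of the doubling rule on unordered pairs. The sketch below is illustrative rather than the paper's method: it builds the pairs from the stable zone using r = (d1 + d2)/2 and s = (d1 − d2)/2, then follows each pair under doubling. The function names are assumptions of this example.

```python
from collections import Counter


def proj(x, base):
    """Representative of x modulo base with x and -x identified."""
    x %= base
    return min(x, base - x)


def stable_pairs(base):
    """Unordered pairs of classes coming from the stable zone d1 > d2 > 0, both odd."""
    pairs = set()
    for d1 in range(3, base - 1, 2):
        for d2 in range(1, d1, 2):
            r, s = (d1 + d2) // 2, (d1 - d2) // 2
            pairs.add(frozenset((proj(r, base), proj(s, base))))
    return pairs


def double(pair, base):
    """Double both entries of a pair, ignoring signs and order."""
    return frozenset(proj(2 * x, base) for x in pair)


def cycle_lengths(base):
    """Counter mapping each cycle length to the number of cycles of that length."""
    remaining = stable_pairs(base)
    counts = Counter()
    while remaining:
        start = remaining.pop()
        current, length = double(start, base), 1
        while current != start:
            remaining.discard(current)
            current = double(current, base)
            length += 1
        counts[length] += 1
    return counts


for base in (7, 9, 11):
    print(base, dict(cycle_lengths(base)))
```

The loop relies on doubling being a permutation of these pairs, which holds because 2 is invertible modulo an odd base.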
Implications and impact
- This work gives a complete and tidy picture of four-digit Kaprekar dynamics in every odd base B > 3.
- It connects a digit trick to core number theory: modular arithmetic and how 2 behaves modulo B.
- It provides exact counts and lengths of all terminal cycles, not just rough bounds.
- It shows a method: reduce a digit process to a small set of differences, then switch to the right coordinates to reveal a simple underlying rule.
Bonus note: The authors also verified their results using Lean (a proof assistant) and an AI tool called AxiomProver, treating this project as a test case for AI-assisted formal mathematics. This increases confidence in the correctness and shows how human insight and formal verification can work together.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
The paper provides a complete structural description for four-digit Kaprekar dynamics in odd bases $B>3$, but leaves several concrete avenues for further research:
- Even bases: Extend the structural model to even bases. Identify invariant regions and coordinates (analogous to ) that make one Kaprekar step conjugate to a simple endomorphism (e.g., multiplication by a constant) on an appropriate quotient of . Determine whether a bounded pre-period exists and whether an equally rigid conjugacy can hold in even bases (building on or sharpening the coset methods in prior work).
- Base $3$: Provide a dedicated four-digit classification in base $3$ (where $\mathcal{T}_3$ is empty) that parallels the odd-base theory. Is there a natural replacement for $\mathcal{T}_B$ and for the projective doubling model in this special case?
- Other digit lengths: Generalize the “difference-coordinate” and projective-conjugacy approach beyond four digits. For $2k$ digits, is there a bounded transient and a conjugacy to a linear (or projective) map on -tuples of classes in ? Identify the appropriate invariants and the induced map (e.g., “multiplication by $2$” analogs, matrices, or multi-coordinate maps).
- Optimality of the 3-step transient: Determine whether the three-step bound is sharp for each odd $B$. Classify initial states that require exactly 1, 2, or 3 steps to enter $\mathcal{T}_B$, and characterize the worst-case pre-period per base.
- Time-to-cycle statistics: Beyond entry into $\mathcal{T}_B$, characterize the total transient length to reach a terminal cycle from a random initial four-digit string. Determine the distribution and maximal transient length as functions of $B$.
- Composite bases and sharper bounds: Replace the bound with one involving the Carmichael function (the exponent of ). Since the projective order of $2$ divides , can one prove a uniform bound of the form and characterize equality cases for composite ?
- Structure at prime powers: For ( odd), give explicit formulas for (the projective order of $2$) and for in terms of and the lifting of to . Determine when attains the refined bounds and how cycle lengths distribute across .
- Multiplicativity and factorization dependence: Establish whether is multiplicative (or has a clean factorization) in and provide closed-form cycle-count formulas for special families (e.g., squarefree , semiprimes, or prime powers), refining the general divisor-sum formula using the Chinese Remainder Theorem.
- Invariant parameters on : Make explicit and exploit orbit invariants such as and (which are preserved under doubling for odd ) to stratify and count cycles, and to describe basins of attraction in digit space more directly.
- Density of bases achieving the upper bound: For primes , iff (equivalently, projective order and $2$ is a quadratic nonresidue). Quantify how often this occurs:
- Determine the asymptotic density of such primes (conditionally under Artin’s primitive root conjecture and unconditionally in special families).
- Describe infinite families and explicit lower bounds for the frequency of primes with maximal projective order.
- Cycle-length spectra in a fixed base: Characterize the full set of attainable terminal cycle lengths for a given composite $B$ in terms of $B$’s factorization and the orders of $2$ modulo its odd divisors. Provide simple necessary/sufficient conditions for the presence of particular lengths.
- Pre-image and basin sizes: For each terminal cycle (or each pair in ), count the number of four-digit strings mapping into its basin. Provide exact formulas for the number of digit strings per difference pair and per cycle, and study how these counts depend on .
- Uniformity of distribution on : After at most three steps, every orbit lies in and is in bijection with via . Is the induced distribution on (from uniformly random four-digit strings) uniform? If not, characterize the bias and its dependence on .
- Even-finer “local” dynamics: Describe the directed graph structure on (outside and inside ), including in-degree distributions, component structure, and how components merge into cycles under the conjugacy. Provide explicit classification of all pre-periodic tails.
- Leading-zero constraints: The analysis allows leading zeros. How do cycle counts and lengths change if one forbids leading zeros (i.e., requires )? Provide adjusted versions of , , and under this restriction.
- Algorithmic aspects: Develop and analyze linear-time algorithms (in ) for computing the terminal cycle of a given four-digit string via the projective model. Provide complexity guarantees and practical implementations for large .
- Extensions to related digit operations: Investigate whether the same projective-doubling mechanism (or analogous linear models) persists for variants (e.g., -digit Kaprekar routines with different ordering rules, or other “sort-and-subtract” digit dynamics). Quantify when the hidden arithmetic reduces to multiplication by a constant on a projective space.
Practical Applications
Overview
The paper uncovers a simple, exact structure for the four‑digit Kaprekar routine in every odd base $B>3$: after at most three steps, the process on digit differences enters a stable region where each iteration is conjugate to multiplying two projective residues by $2$ (with signs identified). This yields explicit bounds and counts for terminal cycles via the projective order of $2$ modulo divisors of $B$, and it comes with a fully verified Lean/mathlib formalization produced via an AI-assisted workflow.
Below are practical applications derived from these findings and methods. Each item includes sector tags, a short description, possible tools/products/workflows, and key assumptions/dependencies.
Immediate Applications
- Education: number theory and dynamical systems
- Use case: Classroom modules and interactive apps to teach modular arithmetic, multiplicative order, projective residues, cycle decompositions, and conjugacy in dynamical systems through the Kaprekar routine.
- Tools/products/workflows:
- Web visualizers that show the three-step “pre-period” and then the doubling dynamics on .
- Problem sets and coding labs guiding students to compute and predict cycle lengths/counts.
- Assumptions/dependencies:
- Base $B$ must be odd and greater than $3$ to ensure the stable region is nonempty and the conjugacy holds.
- Computing multiplicative orders requires modular arithmetic; for classroom scales, $B$ is small.
- Software engineering: testing and verification of digit algorithms
- Use case: Property-based tests and oracles for base‑$B$ digit operations (sorting, borrow/carry subtraction) and finite‑state transformations.
- Tools/products/workflows:
- Implement the difference-coordinate map and verify that after at most three steps, outputs are in $\mathcal{T}_B$ and then follow the projective‑doubling law; use this as an invariant for regression tests.
- Fuzzers that generate random four‑digit inputs, then check conformance with predicted cycle lengths from .
- Assumptions/dependencies:
- Odd base $B>3$; deterministic sorting/subtraction routines.
- Correct handling of modular arithmetic to compute and doubling mod .
- Academic research: finite dynamical systems exemplars
- Use case: A compact, fully solved model illustrating how to reduce a digit rule to a group action via conjugacy on a forward‑invariant subset; portable as a pattern for analyzing other finite systems.
- Tools/products/workflows:
- Research notes/lectures showing stepwise reduction: digit differences → stable region → projective residues → doubling.
- Comparative studies with other sorting‑then‑operating routines (e.g., digital root–type maps).
- Assumptions/dependencies:
- The method depends on identifying invariants and a forward‑invariant region; immediate transfer to other problems requires analogous structure.
- Formal methods: reproducible proofs and AI‑assisted formalization
- Use case: A ready testbed for benchmarking theorem provers and AI assistants on nontrivial but self‑contained math results (dynamics, modular arithmetic, and finite combinatorics).
- Tools/products/workflows:
- Use the public repo to set up Lean 4.28.0 environments that build the provided proof objects; integrate into CI for reproducible builds.
- Develop internal guidelines for crediting AI tools (as in the paper) and recording verification protocols.
- Assumptions/dependencies:
- Lean/mathlib version compatibility (Lean 4.28.0 as used in the paper).
- Availability of AxiomProver or similar tools, though verification can be done with Lean alone.
- Educational games and puzzles
- Use case: Design base‑$B$ Kaprekar puzzles with guaranteed cycle lengths and counts, tailored by choosing $B$ with known properties (e.g., primes with specific order of $2$).
- Tools/products/workflows:
- Generators that output puzzles with specified difficulty by varying $B$ and targeting longest cycles.
- Leaderboards/challenges comparing pre-period lengths and convergence behavior across bases.
- Assumptions/dependencies:
- Non-decimal bases may need brief onboarding for players; use visualization and guided explanations.
- RNG and security pedagogy (cautionary example)
- Use case: Demonstrate why superficially “scrambling” digit routines can yield short cycles; analyze periods via to illustrate pitfalls in PRNG design.
- Tools/products/workflows:
- Classroom labs comparing Kaprekar-based sequences against proper PRNGs; show that periods are at most $(B-1)/2$ and often much smaller in composite bases.
- Assumptions/dependencies:
- Not suitable for secure applications; clearly label this as an anti‑pattern example.
- Quick calculators and libraries
- Use case: Fast computation of cycle lengths and counts for any odd base using the explicit formulas via and .
- Tools/products/workflows:
- Lightweight libraries (Python/Julia/Rust) that:
- Map inputs to difference coordinates $(d_1,d_2)$, step to $\mathcal{T}_B$ in ≤3 iterations, then track cycles via doubling on the projective pairs $\{[r],[s]\}$.
- Compute via the paper’s formula.
- Assumptions/dependencies:
- Efficient modular exponentiation; for large , factoring may be needed to compute precisely (see long‑term items for scaling).
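As a rough sketch of what such a library routine could look like, the function below follows the pipeline described in this item: read off the digit differences, apply at most three Kaprekar steps to reach the stable region, then measure the cycle length by doubling the projective pair. All names (`kaprekar_step`, `proj`, `terminal_cycle_length`) are hypothetical and the code is written for clarity, not for large bases.

```python
def kaprekar_step(digits, base):
    """One sort-and-subtract step on a list of four digits (leading zeros kept)."""
    desc = sorted(digits, reverse=True)
    high = sum(d * base**i for i, d in enumerate(reversed(desc)))
    low = sum(d * base**i for i, d in enumerate(desc))
    n = high - low
    out = []
    for _ in digits:
        n, d = divmod(n, base)
        out.append(d)
    return out[::-1]


def proj(x, base):
    """Representative of x modulo base with x and -x identified."""
    x %= base
    return min(x, base - x)


def terminal_cycle_length(digits, base):
    """Length of the terminal cycle reached from a nonconstant four-digit string."""
    if base <= 3 or base % 2 == 0:
        raise ValueError("expected an odd base greater than 3")
    if len(digits) != 4 or len(set(digits)) == 1:
        raise ValueError("expected four digits that are not all equal")
    for _ in range(4):  # check the input, then after each of up to three steps
        a, b, c, d = sorted(digits, reverse=True)
        d1, d2 = a - d, b - c
        if d1 > d2 > 0 and d1 % 2 == 1 and d2 % 2 == 1:
            break
        digits = kaprekar_step(digits, base)
    else:
        raise RuntimeError("stable region not reached within three steps")
    r, s = (d1 + d2) // 2, (d1 - d2) // 2
    start = frozenset((proj(r, base), proj(s, base)))
    current, length = frozenset(proj(2 * x, base) for x in start), 1
    while current != start:
        current = frozenset(proj(2 * x, base) for x in current)
        length += 1
    return length


print(terminal_cycle_length([4, 3, 2, 1], 9))
print(terminal_cycle_length([1, 0, 0, 0], 11))
```

Once the pair is in hand, no further digit sorting is needed; the cycle is found by modular doubling alone.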
Long-Term Applications
- Generalizations to industrial/state‑space reductions
- Sector: Software verification, model checking, cyber‑physical systems
- Use case: Export the “identify invariant + conjugate to simple action” methodology to simplify verification of complex finite‑state machines by mapping to smaller quotient dynamics.
- Tools/products/workflows:
- Develop reusable patterns in static analyzers/model checkers that attempt invariant discovery and search for conjugacy to canonical actions (e.g., linear maps on groups).
- Assumptions/dependencies:
- Success depends on problem structure; requires automated invariant discovery and equivalence detection techniques.
- Extension to other bases and digit lengths
- Sector: Academia (mathematics, theoretical CS)
- Use case: Extend the projective‑doubling conjugacy to even bases or to ‑digit Kaprekar maps; classify stable regions and cycle structures.
- Tools/products/workflows:
- Research software that experiments with different digit counts and bases, searching for stable regions and simple conjugate actions (possibly non‑doubling).
- Integration with formal proof assistants for verified classifications.
- Assumptions/dependencies:
- Even bases are known to be more intricate; success likely requires new ideas (cf. cited work by Kay and Downes‑Ward).
- Scalable computation of counts for large
- Sector: Software, HPC, computational number theory
- Use case: Compute for very large by accelerating the determination of over without full factorization.
- Tools/products/workflows:
- Utilize partial factorization, elliptic curve or Pollard–rho methods, and modular order algorithms; cache results over prime powers and use Chinese remaindering.
- Assumptions/dependencies:
- For worst‑case composite , computing orders may remain hard without factoring; exact counts may be infeasible at cryptographic sizes.
- Formalization pipelines across research groups
- Sector: Academia, research policy
- Use case: Establish repeatable, team‑scale workflows for AI‑assisted formalization: inputs (.tex + tasks), standardized environments (.environment), generated Lean files, and review protocols.
- Tools/products/workflows:
- Templates and CI scripts that orchestrate: problem statements → AI draft proofs → human curation → Lean verification → artifact archiving.
- Assumptions/dependencies:
- Community standards for crediting AI tools; stable theorem‑prover ecosystems; training and onboarding for researchers.
- Curriculum and outreach programs
- Sector: Education policy and institutions
- Use case: Embed Kaprekar dynamics as a capstone linking algebra, number theory, and computation; demonstrate proof-to-software reproducibility.
- Tools/products/workflows:
- Project‑based courses where students move from experimental discovery to formal proof verification and back to software/puzzle deployment.
- Assumptions/dependencies:
- Institutional support for blended math–CS curricula; access to infrastructure for interactive visualization and theorem proving.
- Benchmarking and safety in AI‑for‑Math
- Sector: AI/ML, research policy
- Use case: Build curated benchmark suites (like this paper) to evaluate AI assistants’ reliability on mid‑complexity, concept‑dense proofs; monitor hallucination, adherence, and verifiability.
- Tools/products/workflows:
- Versioned benchmark corpora with reference Lean proofs, metrics for proof validity, and protocols for attribution and human oversight.
- Assumptions/dependencies:
- Ongoing development of provers and AI agents; community governance on evaluation standards.
- Applied guidance for RNG and protocol design (negative patterns)
- Sector: Security engineering
- Use case: Establish checklists and automated detectors that flag digit‑based “scramblers” with provably bounded and predictable cycles (e.g., Kaprekar‑like), preventing misuse in security‑critical contexts.
- Tools/products/workflows:
- Static analyzers scanning codebases for known low‑period constructions; test harnesses that estimate periods and compare against theoretical thresholds.
- Assumptions/dependencies:
- Requires bridging from theory to practical pattern recognition; may need domain‑specific knowledge to avoid false positives.
- Content generation for edutainment platforms
- Sector: Educational technology, gaming
- Use case: Automated creation of puzzles/games that adaptively choose bases and parameters to produce desired cycle complexity and variety.
- Tools/products/workflows:
- Parameterized generators tuned by and ; difficulty scaling by base primality and order conditions (equality in only for primes with $2$ of projective order ).
- Assumptions/dependencies:
- Requires a library to compute (or approximate) orders efficiently for a catalog of bases; UX to teach non‑decimal bases.
Notes on Assumptions and Dependencies
- Mathematical scope:
- Core results apply to the four‑digit Kaprekar routine in odd bases $B>3$; the stable region and projective‑doubling conjugacy underpin most applications.
- Maximal cycle characterization requires prime $B$ and that $2$ have projective order $(B-1)/2$. Base $5$ is a special non‑attaining case.
- Computational requirements:
- For exact cycle counts , one needs for odd divisors ; this may require factoring for large inputs.
- Security caveat:
- The transformation yields short and predictable cycles; it is not suitable for cryptographic purposes. It is best used as an instructive counterexample.
- Formal methods:
- Lean 4.28.0 and mathlib versions must match the provided environment; reproducibility follows if versions are pinned and CI builds are used.
Glossary
- AxiomProver: An AI-assisted formal mathematics system used to generate and verify formal proofs in Lean. "AxiomProver produced Lean/mathlib formalizations of these results."
- bijection: A one-to-one and onto mapping between two sets. "The map is a bijection."
- conjugate: In dynamical systems, two maps are conjugate if a bijection relabels states so that one map corresponds to the other. "the restricted map is conjugate to the doubling map"
- difference coordinates: The pair of integers capturing the outer and inner digit differences that determine the next Kaprekar iterate. "This passage to difference coordinates does not lose any eventual cycle lengths."
- doubling map: The transformation that multiplies each projective residue class by 2. "the doubling map "
- Euler’s totient function: The function counting integers in $\{1,\dots,n\}$ that are coprime to $n$. "This group has order ,"
- finite dynamical systems: Systems with a finite state space and an iterated map defining evolution over time. "Here, in the context of finite dynamical systems, we explicate the term ``conjugate'' as follows."
- fixed point: A state that maps to itself under the iterated transformation. "the process is drawn to the constant $6174$, which is a fixed point since"
- forward-invariant: A subset that is mapped into itself by the transformation. "Then , as defined in \eqref{eq:stable-chamber}, is forward-invariant"
- gcd (greatest common divisor): The largest integer dividing two integers without remainder. "and let ."
- group quotient: A group formed by identifying elements according to a normal subgroup (here, modulo signs). "so is a group quotient rather than merely a set of projective residue classes."
- Kaprekar map: The function induced by the Kaprekar routine on the difference-coordinate state space. "Let be the four-digit Kaprekar map in difference coordinates."
- Kaprekar routine: The process of sorting a number’s digits in descending and ascending order, subtracting, and repeating. "This simple process is known as the Kaprekar routine,"
- lcm (least common multiple): The smallest positive integer divisible by each of two given integers. "If , then each choice of the two cycles contributes orbits of unordered pairs, each of length ."
- Lean/mathlib: The Lean theorem prover and its community math library used for formal verification. "AxiomProver produced Lean/mathlib formalizations of these results."
- orbit: The sequence of states obtained by iteratively applying the map starting from a given state. "After at most three iterations, every orbit enters our smaller stable region ."
- pre-periodic segment: Initial portion of an orbit before it enters a repeating cycle. "after a bounded pre-periodic segment, the Kaprekar map is just projective doubling."
- projective doubling: Doubling viewed on projective residue classes where each class is identified with its negative. "the map is conjugate to projective doubling:"
- projective order of 2: The least positive integer $m$ such that $2^m\equiv\pm1\pmod{n}$, for an odd modulus $n$. "For odd , define ."
- projective residue classes: Residue classes modulo $B$ where each class is identified with its negative (i.e., $[x]$ and $[-x]$ are the same). "we will pass to projective residue classes modulo , where each residue is identified with its negative."
- stable odd region: The subset where both differences are positive, unequal, and odd, and which is forward-invariant. "will lie in the stable odd region defined by"
- state space: The set of all admissible states on which the map acts. "Accordingly, we define a natural state space"
- terminal cycle: A periodic loop that an orbit eventually enters and remains in. "the number of terminal cycles of length is"
- unordered two-element subsets: Pairs of two distinct elements considered without order. "Finally, let denote the (unordered) two-element subsets of ."
- unit (mod n): An integer coprime to $n$, which therefore has a multiplicative inverse modulo $n$. "with a unit modulo ."