Independent String Races Analysis
- Independent string races are a probabilistic framework where players monitor independent i.i.d. streams to detect target patterns.
- The analysis employs generating functions, border-polynomial methods, and Hadamard products to derive exact waiting times and win probabilities.
- Under bias, paradoxical effects such as mean waiting time reversals and non-transitive win cycles emerge, challenging typical stochastic dominance.
Independent string races are a probabilistic framework in which each of two or more players observes their own independent stream of i.i.d. trials over a finite alphabet, seeking the first occurrence of a designated target string. The fundamental question is to compute for two given patterns the probability that one appears before the other, under various conditions (such as fair or biased sources). This setting arises as a natural variant and generalization of non-transitive phenomena in pattern matching, including classical problems like Penney’s Ante, but with independence rather than shared randomness, leading to profound regularities and paradoxes in waiting-time behaviors and win-odds (Riis et al., 23 Jan 2026).
1. Formal Model and Problem Definition
Let $\mathcal{A}$ denote a finite alphabet of size $q$, with each symbol $a \in \mathcal{A}$ occurring independently at each time with probability $p_a$. Each player receives an infinite, independent sequence of i.i.d. symbols. Fix two target strings, $u$ and $v$. Define for each player the stopping time $T_w$ (with $w \in \{u, v\}$) as the first time an observed length-$|w|$ block matches $w$ exactly. The contest is to determine, for each pair, the win probability:
$$P(u \succ v) = \Pr(T_u < T_v) + \tfrac{1}{2}\Pr(T_u = T_v),$$
where ties are assigned to either player with equal probability.
This setup contrasts with the “shared-stream” or “Penney’s game” scenario, where non-transitivity arises from dependent observations. Here, independence deeply influences the possible ordinal relations among patterns.
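The race defined above can be illustrated with a small Monte Carlo sketch. It is only an illustration under stated assumptions: the function names `waiting_time` and `simulate_race`, the `H`/`T` alphabet, and the trial count are hypothetical choices, not taken from the source. Each player draws from their own stream, and a tie contributes one half to the score.

```python
import random

def waiting_time(pattern, probs, rng):
    """Draw i.i.d. symbols until `pattern` first appears; return the stopping time."""
    symbols = list(probs)
    weights = [probs[s] for s in symbols]
    window = ""
    t = 0
    while True:
        t += 1
        # Keep only the last len(pattern) symbols of the stream.
        window = (window + rng.choices(symbols, weights)[0])[-len(pattern):]
        if window == pattern:
            return t

def simulate_race(u, v, probs, trials=20000, seed=0):
    """Estimate P(T_u < T_v) + 0.5 * P(T_u = T_v) with independent streams."""
    rng = random.Random(seed)
    score = 0.0
    for _ in range(trials):
        tu = waiting_time(u, probs, rng)  # first player's own stream
        tv = waiting_time(v, probs, rng)  # second player's independent stream
        if tu < tv:
            score += 1.0
        elif tu == tv:
            score += 0.5
    return score / trials

if __name__ == "__main__":
    fair = {"H": 0.5, "T": 0.5}
    print(simulate_race("HT", "HH", fair))
```

The estimate is noisy by construction, so it is best treated as a sanity check against exact computations rather than as a result in its own right.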
2. Waiting-Time Generating Functions and Marginals
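The marginal law of $T_w$ for a string $w = w_1 \cdots w_m$ is captured through the border-polynomial apparatus. A border of $w$ is any $k \in \{1, \dots, m\}$ where the prefix of length $k$ matches the suffix of length $k$. The set of all such borders, $\mathcal{B}(w)$, determines the border polynomial:
$$B_w(x) = \sum_{k \in \mathcal{B}(w)} x^k.$$
Writing $\Pr(x_1 \cdots x_j) = p_{x_1} \cdots p_{x_j}$ for the probability of a block (the empty block has probability 1), the probability generating function (pgf) for the stopping time $T_w$ is:
$$G_w(z) = \mathbb{E}\left[z^{T_w}\right] = \frac{\Pr(w)\, z^m}{\Pr(w)\, z^m + (1 - z) \sum_{k \in \mathcal{B}(w)} \Pr(w_{k+1} \cdots w_m)\, z^{m-k}},$$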
with a continued-fraction formula:
1
where
2
and 3. The mean waiting time is given by:
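$$\mathbb{E}[T_w] = \sum_{k \in \mathcal{B}(w)} \frac{1}{\Pr(w_1 \cdots w_k)},$$
where the sum runs over the border lengths $k$ of $w$ (including the full length $k = |w|$) and $\Pr(w_1 \cdots w_k)$ is the probability of the length-$k$ prefix. In the special case of a fair source ($p_a = 1/q$ for all symbols $a$, with $q$ the alphabet size), this specializes to:
$$\mathbb{E}[T_w] = \sum_{k \in \mathcal{B}(w)} q^k.$$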
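As a minimal sketch of the mean waiting time computation, the snippet below lists the borders of a pattern and sums the reciprocal prefix probabilities over the border lengths, which is the standard result for i.i.d. sources. The helper names `borders` and `mean_waiting_time` and the dictionary representation of the source are assumptions made for illustration.

```python
from math import prod

def borders(w):
    """Border lengths k (1 <= k <= len(w)) with prefix of length k == suffix of length k."""
    return [k for k in range(1, len(w) + 1) if w[:k] == w[-k:]]

def mean_waiting_time(w, probs):
    """Sum of 1 / P(prefix of length k) over all border lengths k of w."""
    return sum(1.0 / prod(probs[c] for c in w[:k]) for k in borders(w))

if __name__ == "__main__":
    fair = {"H": 0.5, "T": 0.5}
    for w in ("HT", "HH", "HTH"):
        print(w, borders(w), mean_waiting_time(w, fair))
```

For a fair coin this reduces to a sum of powers of 2 over the border lengths, so `HH` (borders 1 and 2) has a longer mean wait than `HT` (border 2 only).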
3. Head-to-Head Odds and the Hadamard Product Method
The independence of Alice’s and Bob’s streams enables the full factorization of joint events, and the analysis of head-to-head odds relies crucially on generating functions and Hadamard products. For string 8, define:
- 9, with 0
- 1
- 2
- 3
The Hadamard (termwise) product of two series, $\left(\sum_n a_n z^n\right) \odot \left(\sum_n b_n z^n\right) = \sum_n a_n b_n z^n$, is used to combine occurrence probabilities at each $n$. Then
- 6
- 7
Thus, 8 is expressible as a combination of Hadamard products of the individual pattern pgfs and their tails, all reducible to closed-form rational functions in 9 and the bias parameters.
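The termwise combination can be sketched numerically. Because the streams are independent, $\Pr(T_u < T_v) = \sum_n \Pr(T_u = n)\Pr(T_v > n)$ and $\Pr(T_u = T_v) = \sum_n \Pr(T_u = n)\Pr(T_v = n)$, which are exactly termwise products of coefficient sequences. The sketch below is not the closed-form rational calculus of the source: it computes each first-hit distribution with a pattern-matching automaton and truncates the sums at an assumed `horizon`. All function names are hypothetical.

```python
def first_hit_pmf(w, probs, horizon):
    """pmf[n] = P(T_w = n) for n = 1..horizon, via a prefix-matching automaton."""
    m = len(w)

    def step(j, c):
        # New state: longest suffix of (matched prefix + c) that is a prefix of w.
        s = w[:j] + c
        while s and not w.startswith(s):
            s = s[1:]
        return len(s)

    state = [1.0] + [0.0] * (m - 1)  # mass on each partial-match length 0..m-1
    pmf = [0.0] * (horizon + 1)
    for n in range(1, horizon + 1):
        nxt = [0.0] * m
        for j, mass in enumerate(state):
            for c, pc in probs.items():
                k = step(j, c)
                if k == m:
                    pmf[n] += mass * pc  # pattern completed at time n
                else:
                    nxt[k] += mass * pc
        state = nxt
    return pmf

def win_probability(u, v, probs, horizon=400):
    """P(T_u < T_v) + 0.5 * P(T_u = T_v), truncated at `horizon`."""
    fu = first_hit_pmf(u, probs, horizon)
    fv = first_hit_pmf(v, probs, horizon)
    win = tie = 0.0
    tail_v = 1.0  # running value of P(T_v > n)
    for n in range(1, horizon + 1):
        tail_v -= fv[n]
        win += fu[n] * tail_v  # termwise product: pmf of u times tail of v
        tie += fu[n] * fv[n]   # termwise product: pmf of u times pmf of v
    return win + 0.5 * tie

if __name__ == "__main__":
    fair = {"H": 0.5, "T": 0.5}
    print(win_probability("HT", "HH", fair))
```

The truncation error is bounded by the probability that neither pattern has appeared by the horizon, so the horizon should be chosen well above the larger mean waiting time.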
4. Stochastic Dominance and Total Pre-Order for Fair Dice
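Comparisons between patterns are formalized through stochastic dominance: $T_u \preceq_{\mathrm{st}} T_v$ if $\Pr(T_u > n) \le \Pr(T_v > n)$ for all $n \ge 0$. The crucial result for the fair-source case (every symbol has probability $1/q$, with $q$ the alphabet size) is: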
- The following are equivalent for any two patterns:
- Equality holds if and only if the two patterns have identical border sets and thus identical stopping time distributions.
This result implies that, under fairness, stochastic dominance yields a total preorder, with the ordering completely determined by the sum $\sum_k q^k$ over the border lengths $k$ (the border set read as a base-$q$ number, with $q$ the alphabet size), which in turn equals the mean waiting time. The difference-factorization lemma,
1
shows that the sign (and hence order) is lexicographically determined by border polynomials.
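A finite-horizon check of stochastic dominance under a fair source can be sketched by brute force. Under fairness, $\Pr(T_w > n)$ is the fraction of length-$n$ strings that avoid $w$, so the survival functions can be compared directly. This is a numerical illustration only, not a proof, and the names `survival_fair` and `dominates`, along with the horizon of 12, are assumptions.

```python
from itertools import product

def survival_fair(w, alphabet, n):
    """P(T_w > n) under a fair source: fraction of length-n strings avoiding w."""
    total = len(alphabet) ** n
    avoid = sum(1 for s in product(alphabet, repeat=n) if w not in "".join(s))
    return avoid / total

def dominates(u, v, alphabet="HT", horizon=12):
    """True if P(T_u > n) <= P(T_v > n) for every n = 1..horizon."""
    return all(
        survival_fair(u, alphabet, n) <= survival_fair(v, alphabet, n)
        for n in range(1, horizon + 1)
    )

if __name__ == "__main__":
    print(dominates("HT", "HH"), dominates("HH", "HT"))
    print(dominates("HT", "TH"), dominates("TH", "HT"))
```

Patterns with the same border set, such as `HT` and `TH`, pass the check in both directions, which is consistent with their stopping times having identical distributions.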
5. Breakdown Under Bias: Incomparability and Non-Transitivity
For biased binary sources (symbol probabilities $p$ and $1 - p$ with $p \neq 1/2$), the total preorder property fails. Explicitly:
- Total comparability under stochastic dominance, over all binary patterns, holds iff $p = 1/2$.
- For 4, there exist patterns (e.g., 5, 6 for large 7) where neither 8 stochastically dominates 9 nor vice versa, though their mean waiting times are still ordered.
- The lack of total comparability means that expectation does not always predict win probability orderings, and intransitivities may arise.
Bias thus fundamentally disrupts transitive and monotonic relationships observed in the fair setting.
6. Bias-Driven Phenomena: Mean-Reversal and Non-Transitive Cycles
With $p \neq 1/2$, two principal paradoxes manifest:
- Reversal between mean waiting time and win probability: For given patterns, the pattern with longer mean waiting time can nevertheless win more often head-to-head. For example, for 1 and 2, the unique crossover for win probability occurs at 3, but the means cross at 4, yielding an interval where 5 but 6.
- Existence of non-transitive cycles: There exist triples 7 and a fixed bias 8 such that 9, 0, and 1. Explicit examples include:
- For unequal biases: 2, 3; 4, 5; 6, 7.
- For equal biases and different lengths: 8, 9, 0 form a 3-cycle for 1.
- Extension to 2 (three-sided dice): Patterns 3, 4, 5 and biases in an open region of the simplex.
Comprehensive computational classification up to length 6 for binary strings under common bias finds sixteen distinct non-transitive families and two-pattern reversals exhaust the open 7-interval except near 8.
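A small search for mean-versus-win reversals can be sketched as follows. It does not reproduce the source's classification or its specific examples. It scans short binary patterns at a given bias `p` and reports pairs where the pattern with the longer mean waiting time still wins more than half of the independent races. The function names, the `H`/`T` alphabet, the truncation `horizon`, and the tolerance `eps` are all assumptions, and the horizon must be far larger than the largest mean waiting time for the truncated sums to be trustworthy.

```python
from itertools import product
from math import prod

def borders(w):
    return [k for k in range(1, len(w) + 1) if w[:k] == w[-k:]]

def mean_wait(w, probs):
    # Sum of reciprocal prefix probabilities over border lengths.
    return sum(1.0 / prod(probs[c] for c in w[:k]) for k in borders(w))

def first_hit_pmf(w, probs, horizon):
    """pmf[n] = P(T_w = n), n = 1..horizon, via a prefix-matching automaton."""
    m = len(w)

    def step(j, c):
        s = w[:j] + c
        while s and not w.startswith(s):
            s = s[1:]
        return len(s)

    state = [1.0] + [0.0] * (m - 1)
    pmf = [0.0] * (horizon + 1)
    for n in range(1, horizon + 1):
        nxt = [0.0] * m
        for j, mass in enumerate(state):
            for c, pc in probs.items():
                k = step(j, c)
                if k == m:
                    pmf[n] += mass * pc
                else:
                    nxt[k] += mass * pc
        state = nxt
    return pmf

def win_probability(fu, fv):
    """P(T_u < T_v) + 0.5 * P(T_u = T_v) from two truncated pmfs."""
    win = tie = 0.0
    tail_v = 1.0
    for a, b in zip(fu[1:], fv[1:]):
        tail_v -= b
        win += a * tail_v
        tie += a * b
    return win + 0.5 * tie

def find_reversals(p, max_len=3, horizon=300, eps=1e-9):
    """Pairs (u, v) where u has the longer mean wait but wins more than half the time."""
    probs = {"H": p, "T": 1.0 - p}
    patterns = ["".join(s) for n in range(1, max_len + 1)
                for s in product("HT", repeat=n)]
    pmf = {w: first_hit_pmf(w, probs, horizon) for w in patterns}
    mean = {w: mean_wait(w, probs) for w in patterns}
    return [(u, v) for u in patterns for v in patterns
            if mean[u] > mean[v] + eps
            and win_probability(pmf[u], pmf[v]) > 0.5 + eps]

if __name__ == "__main__":
    for p in (0.5, 0.6, 0.7):
        print(p, find_reversals(p))
```

At `p = 0.5` the search is expected to return nothing, in line with the fair-source ordering. Larger `max_len` values or biases close to 0 or 1 need a correspondingly larger horizon.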
7. Implications and Classification of Fairness Dichotomy
The fundamental insight is that fair sources (coins or $q$-sided dice with uniform probabilities) are exceptional. For these, mean waiting times totally order all strings by stochastic dominance, and all independent head-to-head races are transitive and expectation-ordered. Any departure from fairness, however slight, allows reversals between orderings by mean and by win probability, and admits non-transitive cycles even for short patterns.
This dichotomy precisely characterizes the interface between regular, predictable races and the array of paradoxes familiar in the study of runs and pattern waiting times. The combinatorial and analytic frameworks developed, particularly the border-polynomial and Hadamard-product calculus, provide exact rational expressions for all relevant quantities in independent string races, enabling both rigorous theorems and exhaustive computational classifications (Riis et al., 23 Jan 2026).