Precise Deviations for the Ewens-Pitman Model
(2512.12323v1)
Published 13 Dec 2025 in math.PR and math.ST
Abstract: In this paper, we derive an integral representation for the distribution of the number of types $K_n$ in the Ewens-Pitman model. Based on this representation, we also establish precise large deviations and precise moderate deviations for $K_n$. After careful examination, we find that the rate function exhibits a second-order phase transition and the critical point is $α=\frac{1}{2}$.
The paper establishes precise deviation asymptotics for Kₙ in both large and moderate regimes of the Ewens-Pitman model.
It derives an explicit contour integral representation and applies saddle point and steepest descent methods to obtain full polynomial prefactors in the asymptotic formulas.
A second-order phase transition at α = 1/2 is identified, offering new insights into the fluctuation behavior and influencing inference in population genetics and Bayesian nonparametrics.
Precise Deviations and Phase Transitions in the Ewens-Pitman Model
Introduction
The Ewens-Pitman sampling model generalizes the classical Ewens sampling formula by introducing two parameters $(\alpha, \theta)$, capturing a broad spectrum of random partition phenomena with deep ties to population genetics and Bayesian nonparametrics. The model's statistical properties, particularly the behavior of the number of types $K_n$ in a sample of size $n$, have been the subject of active investigation. Prior work established laws of large numbers, central limit theorems, and large and moderate deviation principles for $K_n$. The current paper advances this line of work by deriving precise deviations for $K_n$ in both the large and moderate deviation regimes, making the polynomial prefactor explicit, and rigorously identifying a second-order phase transition in the rate function at $\alpha = 1/2$.
Integral Representation of $K_n$
A cornerstone of the analysis is the derivation of an explicit contour integral representation for the probability mass function of $K_n$, with coefficients expressed explicitly through the Gamma function and the structure of the Sibuya distribution. This form enables a rigorous application of saddle point and steepest descent methods for asymptotic analysis.
Figure 1: Intermediate steepest descent contour used to deform the integral in the complex plane for extracting asymptotics of $P(K_n = k)$.
Precise Local Deviation Asymptotics
The main result characterizes the asymptotics of $P(K_n = k)$ for large $n$, in both large and moderate deviation regimes, including all polynomial prefactors. Let $x_k = k/n$, $h(z) = \ln z - x_k \ln\left(1 - (1-z)^{\alpha}\right)$, and let $z(x_k)$ be the unique real solution to $h'(z) = 0$ on $(0,1)$. The principal theorem gives the asymptotics of $P(K_n = k)$ with exponential rate $I(x_k) = h(z(x_k))$, where $I$ is the LDP rate function, together with an explicit polynomial prefactor. The polynomial scaling is sharp.
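In the moderate deviation regime $k \asymp n^{\alpha} b_n^{1-\alpha}$, with $b_n \to \infty$ and $b_n/n \to 0$, the decay of the probability is governed by the expansion of the rate function at $x_k = y\,(b_n/n)^{1-\alpha}$:
$$n\, I\!\left(y \left(\frac{b_n}{n}\right)^{1-\alpha}\right) \sim b_n\,(1-\alpha)\,\alpha^{\frac{\alpha}{1-\alpha}}\, y^{\frac{1}{1-\alpha}},$$
highlighting the polynomial–exponential dichotomy of tails.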
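The rate function defined in this section can be evaluated numerically. The following is a minimal sketch, not the paper's method: it locates the saddle point by plain bisection on $h'$, working in the variable $u = 1 - z$ for numerical stability, and compares $I(x)$ with the small-$x$ form $(1-\alpha)\,\alpha^{\alpha/(1-\alpha)}\,x^{1/(1-\alpha)}$ implied by the moderate-deviation expansion. The function names (`dh_du`, `saddle_u`, `rate`, `small_x_rate`) and the bisection bounds are illustrative assumptions, and the sketch is intended for $0 < x < 1$ and values of $\alpha$ not too close to 0 or 1.

```python
import math

def dh_du(u, x, alpha):
    """Derivative in u of g(u) = h(1 - u), where
    h(z) = ln z - x * ln(1 - (1 - z)**alpha)."""
    return -1.0 / (1.0 - u) + x * alpha * u ** (alpha - 1.0) / (1.0 - u ** alpha)

def saddle_u(x, alpha, iters=200):
    """Bisection for the root of dh_du on (0, 1); returns u = 1 - z(x).
    The bounds 1e-300 and 1 - 1e-12 are ad hoc choices for this sketch."""
    lo, hi = 1e-300, 1.0 - 1e-12
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dh_du(mid, x, alpha) > 0.0:
            lo = mid   # g is still increasing, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rate(x, alpha):
    """I(x) = h(z(x)), using log1p for accuracy when z is close to 1."""
    u = saddle_u(x, alpha)
    return math.log1p(-u) - x * math.log1p(-(u ** alpha))

def small_x_rate(x, alpha):
    """Leading small-x term read off from the moderate-deviation expansion."""
    return (1.0 - alpha) * alpha ** (alpha / (1.0 - alpha)) * x ** (1.0 / (1.0 - alpha))

if __name__ == "__main__":
    for alpha in (0.3, 0.5, 0.7):
        for x in (1e-1, 1e-2, 1e-3):
            print(alpha, x, rate(x, alpha), small_x_rate(x, alpha))
```

For $\alpha = 1/2$ the saddle-point equation can be solved by hand, giving $\sqrt{1 - z(x)} = x/(2 - x)$, which offers a convenient check on the bisection.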
Global Deviations and Summation Asymptotics
The paper extends the analysis to cumulative tails $P(K_n \ge xn)$ by summing the local estimates over admissible $k$ and invoking precise discrete Laplace-type estimates. The final asymptotics involve the derivative of the rate function and the increment of $k$, capturing discretization effects that are significant at the scale of deviations; in particular, the resulting formula depends on $\{nx\}$, the fractional part of $nx$. This representation accurately tracks the probability mass in the tails of $K_n$ with explicit combinatorial and analytic constants.
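For moderate $n$, tail probabilities of this kind can be computed exactly and used as a sanity check on asymptotic formulas. The sketch below is not taken from the paper: it assumes the standard sequential (Chinese restaurant) description of the Ewens-Pitman model, in which a sample of size $m$ with $k$ types gains a new type with probability $(\theta + \alpha k)/(\theta + m)$, and it assumes $0 \le \alpha < 1$ and $\theta > -\alpha$. The function names are illustrative.

```python
import math

def kn_pmf(n, alpha, theta):
    """Exact pmf of K_n as a list indexed by k (entry 0 is unused)."""
    pmf = [0.0, 1.0]                      # K_1 = 1 with probability one
    for m in range(1, n):                 # go from sample size m to m + 1
        nxt = [0.0] * (m + 2)
        for k in range(1, m + 1):
            p_new = (theta + alpha * k) / (theta + m)   # a new type appears
            nxt[k + 1] += pmf[k] * p_new
            nxt[k] += pmf[k] * (1.0 - p_new)
        pmf = nxt
    return pmf

def upper_tail(n, x, alpha, theta):
    """P(K_n >= x * n), summed directly from the exact pmf."""
    k_min = max(1, math.ceil(n * x))
    return sum(kn_pmf(n, alpha, theta)[k_min:])

def empirical_rate(n, x, alpha, theta):
    """-(1/n) * ln P(K_n >= x * n), to be compared with a rate function."""
    return -math.log(upper_tail(n, x, alpha, theta)) / n

if __name__ == "__main__":
    for n in (50, 100, 200, 400):
        print(n, empirical_rate(n, 0.6, 0.5, 1.0))
```

The recursion costs $O(n^2)$ operations and works in plain floating point, so very deep tails at large $n$ may underflow; a log-space variant would be needed there.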
Phase Transition in the Rate Function
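A critical theoretical insight is the identification of a second-order phase transition at $\alpha = 1/2$ in the rate function's curvature:
$$I''(x) \sim C(\alpha)\, x^{\frac{2\alpha-1}{1-\alpha}}, \quad x \to 0,$$
for a prefactor $C(\alpha)$. The exponent changes sign at $\alpha = 1/2$, so the limiting curvature at the origin is
$$\lim_{x \to 0} I''(x) = \begin{cases} +\infty, & 0 < \alpha < 1/2, \\ \text{const}, & \alpha = 1/2, \\ 0, & 1/2 < \alpha < 1. \end{cases}$$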
The sign and scaling of the sub-exponential prefactor in moderate deviations shift across this transition, fundamentally altering the nature of the fluctuations. This constitutes a nontrivial, explicit phase transition in the precise deviation rates: the polynomial prefactors and effective speeds in the moderate deviation principles change qualitatively at this critical value.
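The exponent $(2\alpha-1)/(1-\alpha)$ in the curvature can be recovered heuristically from the small-$x$ form of the rate function that underlies the moderate-deviation expansion. The derivation below is a formal sketch only: it differentiates the leading term twice, which is an assumption and not a step confirmed by the source, and the symbol $c(\alpha)$ is introduced here for illustration.

```latex
\begin{align*}
% Leading small-x behaviour read off from the moderate-deviation expansion
I(x) &\approx c(\alpha)\, x^{\frac{1}{1-\alpha}},
  \qquad c(\alpha) = (1-\alpha)\,\alpha^{\frac{\alpha}{1-\alpha}}, \\
% Formal differentiation of the leading term (heuristic step)
I'(x) &\approx \frac{c(\alpha)}{1-\alpha}\, x^{\frac{\alpha}{1-\alpha}}, \\
I''(x) &\approx \frac{c(\alpha)\,\alpha}{(1-\alpha)^{2}}\, x^{\frac{1}{1-\alpha}-2}
        = \frac{\alpha^{\frac{1}{1-\alpha}}}{1-\alpha}\, x^{\frac{2\alpha-1}{1-\alpha}} .
\end{align*}
```

Under this heuristic the exponent is negative for $\alpha < 1/2$, zero at $\alpha = 1/2$, and positive for $\alpha > 1/2$, matching the three regimes of the limiting curvature. At $\alpha = 1/2$ the formal constant evaluates to $1/2$.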
Comparison with Pitman's α-Diversity
The analysis also involves refined asymptotics for the tail of the Pitman α-diversity random variable $S_{\alpha,\theta}$, the almost sure limit of $K_n/n^{\alpha}$, exploiting explicit integral and series representations for this variable's density and tail probabilities. The precise deviation rates for $K_n$ and for the limiting fluctuation $S_{\alpha,\theta}$ are shown to be asymptotically compatible, providing a bridge between finite-$n$ and limiting behavior.
Practical and Theoretical Implications
These precise deviation results advance the toolkit available for statistical inference and hypothesis testing in contexts where the Ewens-Pitman model—particularly the distribution of the number of types—plays a central role. This includes nonparametric Bayesian methods, species sampling problems, and random partition structures in population genetics and machine learning. The explicit forms of the deviation probabilities, including subexponential corrections, enable sharper risk and error assessments for rare event analyses, informing confidence levels and credible intervals for sample diversity.
On the theoretical side, the identified phase transition provides a rare example of a non-analytic change in moderate deviation structure for an entire class of random combinatorial objects, likely bearing implications for related partition models, processes attached to Poisson-Dirichlet distributions, and the study of heavy-tailed phenomena in random discrete structures.
Conclusion
The work presents a mathematically rigorous and detailed treatment of precise deviations for $K_n$ in the Ewens-Pitman model. It provides explicit integral representations, careful saddle point asymptotic expansions, and full polynomial prefactors, and it establishes a second-order phase transition in the deviation rate function at $\alpha = 1/2$. These results close a significant gap in our understanding of the Ewens-Pitman model and set a new standard for precise tail characterizations in complex random partition processes. The methods admit generalization to other partition-derived statistics and open avenues for deeper study of phase transitions in probabilistic combinatorics.
Reference: "Precise Deviations for the Ewens-Pitman Model" (2512.12323)