Empirical characterization of the rank-dependent pointwise error bound ε(r)
Develop a direct empirical characterization of ε(r), the function that bounds the pointwise log-domain difference |ln Qθ(w_r) − ln P(w_r)| between a trained language model's marginal token probabilities Qθ and the true marginal token probabilities P, as a function of token frequency rank r. The goal is to assess the validity of Assumption 2, which is used in the proof of the Textual Frequency Law.
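As a rough illustration of the quantity in question, the following Python sketch tabulates the pointwise gap |ln Qθ(w_r) − ln P(w_r)| by frequency rank. It rests on assumptions that are not in the source: the true marginal P is not observable, so the sketch substitutes the relative frequency in a reference corpus as a proxy; the model marginals are assumed to be available as a plain dictionary; and the running maximum used as a candidate envelope presumes that ε(r) is non-decreasing in r. The function names (pointwise_log_gap, running_envelope) and the toy numbers are hypothetical and are not results from the paper.

```python
import math
from collections import Counter


def pointwise_log_gap(corpus_tokens, model_marginal):
    """Return (rank, token, |ln Q - ln P_hat|) tuples ordered by frequency rank.

    ASSUMPTION: P_hat, the relative frequency in `corpus_tokens`, stands in
    for the unobservable true marginal P.
    ASSUMPTION: `model_marginal` maps every corpus token to a strictly
    positive marginal probability Q_theta(token).
    """
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    # Rank 1 is the most frequent token; ties are broken alphabetically
    # so that the ranking is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    gaps = []
    for rank, (token, count) in enumerate(ranked, start=1):
        p_hat = count / total
        q = model_marginal[token]
        gaps.append((rank, token, abs(math.log(q) - math.log(p_hat))))
    return gaps


def running_envelope(gaps):
    """Running maximum of the gaps over rank.

    ASSUMPTION: the bound is non-decreasing in rank, so the smallest
    non-decreasing function dominating the observed gaps is used as a
    candidate for epsilon(r).
    """
    envelope, current = [], 0.0
    for rank, _token, gap in gaps:
        current = max(current, gap)
        envelope.append((rank, current))
    return envelope


# Hypothetical toy data, for illustration only.
toy_corpus = ["the"] * 6 + ["cat"] * 3 + ["sat"] * 1
toy_model = {"the": 0.60, "cat": 0.25, "sat": 0.15}

gaps = pointwise_log_gap(toy_corpus, toy_model)
envelope = running_envelope(gaps)
for (rank, token, gap), (_, bound) in zip(gaps, envelope):
    print(f"rank={rank} token={token} gap={gap:.4f} envelope={bound:.4f}")
```

On real data the corpus proxy itself carries sampling error that grows for rare tokens, so any gap measured this way mixes model error with estimation error in P. That is one reason a direct characterization is harder than this sketch suggests.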
References
These findings collectively support the hypothesis that ε(r) is small for high-frequency tokens and grows with rank, but a direct empirical characterisation of the pointwise bound remains an open problem.
— Adam's Law: Textual Frequency Law on Large Language Models
(2604.02176 - Lu et al., 2 Apr 2026) in Remark (Strength and character of Assumption 2), Section “Assumptions,” Appendix “Scope and Proof Strategy”