Aldous’s conjecture on uniform nearest-neighbor coverage estimation

Prove that there exists a universal constant c such that, for any metric space (S, d) and any i.i.d. sample X1,…,Xn from a probability measure μ on S, E[ sup_{r ≥ 0} { Wn(r)/n − μ(Un(r)) }^2 ] ≤ c/n, where Un(r) = ⋃_{i=1}^n B(X_i, r) is the union of the radius-r balls around the sample points and Wn(r) is the number of sample points whose leave-one-out nearest-neighbor distance is at most r. Establishing this would give a uniform-in-r O(1/n) bound on the mean squared error of the nearest-neighbor coverage estimator Wn(r)/n, with no assumptions on (S, μ).
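
To make the quantities concrete, here is a minimal simulation sketch. It is not taken from the paper and it proves nothing about the conjecture. It assumes one specific setting chosen purely for illustration: S = [0, 1], d(x, y) = |x − y|, and μ uniform, so that μ(Un(r)) can be computed exactly as the length of a union of intervals. It also assumes that the leave-one-out nearest-neighbor distance of X_i means the distance from X_i to the closest other sample point. All function names are hypothetical.

```python
import random
from bisect import bisect_left, bisect_right


def loo_nn_distances(xs):
    """Distance from each point to its nearest *other* sample point (1-D case)."""
    pts = sorted(xs)
    n = len(pts)
    dists = []
    for i, x in enumerate(pts):
        left = x - pts[i - 1] if i > 0 else float("inf")
        right = pts[i + 1] - x if i < n - 1 else float("inf")
        dists.append(min(left, right))
    return dists


def coverage_estimate(xs, r):
    """W_n(r)/n: fraction of points whose leave-one-out NN distance is <= r."""
    return sum(1 for v in loo_nn_distances(xs) if v <= r) / len(xs)


def true_coverage_uniform(xs, r):
    """mu(U_n(r)) for mu = Uniform[0, 1]: length of the union of the
    intervals [x - r, x + r], clipped to [0, 1]."""
    total = 0.0
    cur_lo = cur_hi = None
    for x in sorted(xs):
        lo, hi = max(0.0, x - r), min(1.0, x + r)
        if cur_hi is None:
            cur_lo, cur_hi = lo, hi
        elif lo <= cur_hi:          # overlaps the current merged interval
            cur_hi = max(cur_hi, hi)
        else:                       # gap: close the current interval
            total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total


def sup_sq_deviation(xs):
    """sup over r >= 0 of (W_n(r)/n - mu(U_n(r)))^2 in the uniform-[0,1] setting.

    W_n is a step function that jumps only at the NN distances, while
    mu(U_n(r)) is continuous and nondecreasing in r, so it suffices to check
    the left limit and the value at each jump point.
    """
    n = len(xs)
    pts = sorted(xs)
    d = sorted(loo_nn_distances(pts))
    best = 0.0
    for r in set(d):
        m = true_coverage_uniform(pts, r)
        before = bisect_left(d, r) / n   # W_n just below r
        at = bisect_right(d, r) / n      # W_n at r
        best = max(best, (before - m) ** 2, (at - m) ** 2)
    return best


def estimate_scaled_risk(n, reps, seed=0):
    """Monte Carlo estimate of n * E[sup_r (W_n(r)/n - mu(U_n(r)))^2]."""
    rng = random.Random(seed)
    total = sum(sup_sq_deviation([rng.random() for _ in range(n)])
                for _ in range(reps))
    return n * total / reps


if __name__ == "__main__":
    for n in (25, 50, 100):
        print(n, estimate_scaled_risk(n, reps=50))
```

If the conjectured bound holds, the printed values of n · E[sup …] should stay bounded as n grows in this particular setting. A single distribution on the line says nothing about universality over all (S, μ), which is the substance of the conjecture.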

Background

Theorem \ref{universaldthm} shows that for any fixed radius r, the leave-one-out estimator Wn(r)/n concentrates around μ(Un(r)) with mean squared error O(1/n), which implies O(1/√n) concentration in L2. However, this bound is pointwise in r and does not address uniform control over all radii.
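
The step from the mean squared error bound to the concentration rate can be spelled out as follows. Here D_n(r) and C are names introduced only for this illustration: D_n(r) denotes the deviation at a fixed radius r, and C stands for the constant implicit in the O(1/n) bound. The second inequality is Chebyshev's inequality, valid for every t > 0.

$$ D_n(r) = \frac{W_n(r)}{n} - \mu(U_n(r)), \qquad E\bigl[D_n(r)^2\bigr] \le \frac{C}{n} \;\Longrightarrow\; \bigl\|D_n(r)\bigr\|_{L^2} = \sqrt{E\bigl[D_n(r)^2\bigr]} \le \sqrt{\frac{C}{n}}, \qquad P\bigl(|D_n(r)| \ge t\bigr) \le \frac{E\bigl[D_n(r)^2\bigr]}{t^2} \le \frac{C}{n t^2}. $$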

Aldous conjectured a uniform-in-r strengthening: an O(1/n) bound on the expected supremum, over all r ≥ 0, of the squared deviation, with a constant that does not depend on the underlying metric space or distribution. Such a result would provide a universal calibration tool for nearest-neighbor-based testing of suspicious coincidences, with guarantees that do not depend on (S, μ).

References

Theorem \ref{universaldthm} gives $O(1/\sqrt{n})$ concentration. An unpublished example of Aldous shows this cannot be improved without further assumptions. He conjectured that $$ E\biggl(\sup_{r\ge 0}\biggl\{ \frac{W_n(r)}{n} -\mu(U_n(r)) \biggr\}^2\biggr) \leq \frac{c}{n},\mbox{ for some universal }c. $$ This is an open problem.

Estimating the size of a set using cascading exclusion (2508.05901 - Chatterjee et al., 7 Aug 2025) in Remark following Theorem \ref{universaldthm}, Section “Testing coincidences” (subsection: A universal approximation theorem)