Avoid extreme conservativeness without sample splitting when d = o(n^{2/3})

Establish that, when the initial estimator of the parameter (\hat{\theta}_1) and the estimator of the curvature/Hessian (\hat{V}_G) are computed from the same data rather than from independent splits, the studentized and bias-corrected universal inference procedure for the Kullback–Leibler projection parameter \theta_G avoids extreme conservativeness—i.e., its miscoverage probability at a fixed nominal level does not converge to zero—as the sample size n grows, provided the parameter dimension satisfies d = o(n^{2/3}). In particular, prove such a result under the regularity framework of Section 2.2, potentially by verifying the second-order bias requirement n^{1/2}||\hat{\theta}_1-\theta_G||_{I_G} || I_G^{-1/2}(\hat{V}_G - V_G) I_G^{-1/2} ||_{op} = o_P(1) or an analogous condition sufficient to preclude extreme conservativeness.
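For concreteness, the norm conventions assumed in the display above (standard, though not restated in the excerpt) are the weighted Euclidean norm and the operator (spectral) norm,

$$
\|x\|_{A} = \sqrt{x^{\top} A x}, \qquad \|M\|_{\mathrm{op}} = \sup_{\|u\|_2 = 1} \|M u\|_2,
$$

so the requirement asks that the product of the first-order estimation error of \hat{\theta}_1 and the relative curvature-estimation error of \hat{V}_G vanish faster than n^{-1/2}.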

Background

The proposed studentized and bias-corrected universal inference procedure constructs confidence sets for the KL-projection parameter \theta_G by computing the initial estimate \hat{\theta}_1 and the curvature estimate \hat{V}_G of V_G on independent data splits. This independence permits a univariate projection in the bias condition (Assumption (A7)), easing the requirements needed for asymptotic exactness and for avoiding extreme conservativeness.
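For orientation, a minimal sketch of the classical split likelihood-ratio construction underlying universal inference (Wasserman, Ramdas, and Balakrishnan); the studentized, bias-corrected statistic studied in the paper refines this and is not reproduced here. With the data split into folds D_0 and D_1, and \hat{\theta}_1 fit on D_1, the level-(1-\alpha) confidence set is

$$
C_\alpha = \Bigl\{ \theta \in \Theta : \prod_{i \in D_0} \frac{p_{\hat{\theta}_1}(X_i)}{p_{\theta}(X_i)} \le \frac{1}{\alpha} \Bigr\},
$$

with validity following from Markov's inequality applied to the split likelihood ratio.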

When \hat{\theta}_1 and \hat{V}_G are computed from the same data, the required condition strengthens to n^{1/2}||\hat{\theta}_1-\theta_G||_{I_G} || I_G^{-1/2}(\hat{V}_G - V_G) I_G^{-1/2} ||_{op} = o_P(1), which involves estimating a d×d matrix in operator norm and typically necessitates growth restrictions on d. Drawing parallels to related high-dimensional inference results, the authors conjecture that d = o(n^{2/3}) should suffice to avoid extreme conservativeness in this same-sample setting.
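As a purely illustrative numerical sketch of the quantity involved (this toy model and its estimators are assumptions, not the paper's construction): take a Gaussian model with \theta_G = 0 and I_G = V_G = I_d, use the sample mean as \hat{\theta}_1 and the sample covariance as a stand-in \hat{V}_G, and Monte Carlo-average the second-order bias term for a few (n, d) pairs. In this toy model the term tracks d/\sqrt{n}; the conjecture concerns the weaker goal of avoiding extreme conservativeness, which may hold even when this particular sufficient condition fails.

```python
import numpy as np

rng = np.random.default_rng(0)

def second_order_bias_term(n, d, reps=100):
    """Monte Carlo average of
        sqrt(n) * ||theta_hat - theta_G||_{I_G} * ||I_G^{-1/2}(V_hat - V_G)I_G^{-1/2}||_op
    in a toy Gaussian model with theta_G = 0 and I_G = V_G = I_d."""
    vals = []
    for _ in range(reps):
        x = rng.standard_normal((n, d))                       # X_i ~ N(0, I_d)
        theta_hat = x.mean(axis=0)                            # sample mean as hat{theta}_1
        v_hat = np.cov(x, rowvar=False)                       # sample covariance as stand-in hat{V}_G
        err_theta = np.linalg.norm(theta_hat)                 # ||theta_hat - 0||_{I_d}
        err_curv = np.linalg.norm(v_hat - np.eye(d), ord=2)   # operator norm of V_hat - I_d
        vals.append(np.sqrt(n) * err_theta * err_curv)
    return float(np.mean(vals))

# In this toy model the term behaves like d / sqrt(n): it shrinks when d
# grows slower than n^{1/2} and grows otherwise.
for n in (200, 800, 3200):
    for rate in (0.4, 0.6):
        d = max(2, round(n ** rate))
        print(f"n={n:5d}  d={d:4d}  (d ~ n^{rate}):  term ~ {second_order_bias_term(n, d):.2f}")
```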

A rigorous proof would clarify the precise dimensional growth regime under which the method remains non-degenerate (coverage not tending to 1) without sample splitting, and would guide practical implementations in high dimensions.

References

Mirroring the results in \citet{chang2023inference}, we conjecture that extreme conservativeness can be avoided when $d = o(n^{2/3})$.

On the Precise Asymptotics of Universal Inference (2503.14717 - Takatsu, 18 Mar 2025) in Remark (On the estimation of V_G), Section 2.2: Studentized and Bias-corrected Universal Inference