Structure-Agnostic Estimators

Updated 22 December 2025
  • Structure-agnostic estimators are inference methods that debias black-box nuisance models without assuming smoothness, sparsity, or specific parametric forms.
  • They leverage diverse machine learning tools and cross-fitting techniques to control bias and achieve minimax optimal convergence rates across various estimation tasks.
  • The approach applies broadly—from causal inference and treatment effect estimation to signal processing and cosmology—offering robust, flexible estimation in complex settings.

A structure-agnostic estimator is an inference procedure designed to estimate a functional of interest from data, leveraging black-box estimates of nuisance components while imposing no structural assumptions (such as smoothness, sparsity, or parametric identifiability) on those nuisance models. This methodology emerged from research into the limits of nonparametric functional estimation, particularly in causal inference, where it is often infeasible or undesirable to assume the underlying functional forms (e.g., for regression or propensity score models) reside within specified parametric or smoothness classes. Structure-agnostic estimators treat the rate at which such nuisance functions can be learned as given, and then debias these black-box learners to attain optimal convergence rates for the primary parameter, as quantified by minimax theory. The approach admits a wide diversity of base learners, including random forests, neural networks, boosting, and other modern machine learning tools, so long as their mean-squared errors in $L^2$ can be bounded or empirically estimated.

1. Formal Definition and Foundational Principles

Consider an i.i.d. sample $O_1,\ldots,O_n$ from an unknown distribution $P_0$ and a target parameter (functional) $\tau = \chi(P_0) = T(\eta(P_0))$, where $\eta(P_0)$ denotes one or more nuisance functions (e.g., regression, density, propensity score) derived from $P_0$. Structure-agnostic estimators are defined as estimators $\hat\tau$ that use plug-in nuisance estimators $\hat\eta$ satisfying only a mean-squared error guarantee

$$\|\hat\eta_j - \eta_j(P_0)\|_{L^2(P_{0,Z}),2} \leq \Delta_{n,j},\qquad j = 1,2,\ldots,$$

with no further smoothness, sparsity, or structural constraint on the function class (Jin et al., 19 Dec 2025, Balakrishnan et al., 2023).

In this framework, minimax lower bounds characterize the best achievable estimation error using only these empirical $L^2$-rates for the nuisances. For example, in average treatment effect (ATE) estimation with regression and propensity score nuisances $g_0$ and $m_0$, if black-box oracles $\hat g$, $\hat m$ satisfy $L^2$-error rates $e_n, f_n$, the minimax risk is lower bounded by $\Omega(e_n f_n + 1/n)$ (Jin et al., 22 Feb 2024, Jin et al., 19 Dec 2025).
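
One way to see where a product of nuisance errors comes from is to compute the bias of the treated-arm part of a doubly robust (AIPW-type) score. The sketch below assumes that $\hat g$ and $\hat m$ are fixed functions (for instance, trained on an independent fold) and that $\hat m \ge c > 0$; the constant $c$ and the use of $L^2(P_X)$ norms are assumptions made only for this illustration. Using $\mathbb{E}[D \mid X] = m_0(X)$ and $\mathbb{E}[Y \mid D = 1, X] = g_0(1,X)$,

$$
\mathbb{E}\Big[\hat g(1,X) + \frac{D}{\hat m(X)}\big(Y - \hat g(1,X)\big)\Big] - \mathbb{E}\big[g_0(1,X)\big]
= \mathbb{E}\Big[\big(\hat g(1,X) - g_0(1,X)\big)\,\frac{\hat m(X) - m_0(X)}{\hat m(X)}\Big],
$$

and by the Cauchy–Schwarz inequality the absolute value of the right-hand side is at most

$$
\frac{1}{c}\,\big\|\hat g(1,\cdot) - g_0(1,\cdot)\big\|_{L^2(P_X)}\,\big\|\hat m - m_0\big\|_{L^2(P_X)}.
$$

The bias therefore vanishes if either nuisance is estimated exactly, and it is otherwise controlled by the product of the two nuisance errors rather than by either error alone.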

This formulation fundamentally departs from classical minimax theory, which requires a distributional or functional structure (such as Hölder or Sobolev smoothness) (Jin et al., 19 Dec 2025, Balakrishnan et al., 2023). Structure-agnostic estimators thus provide a more universal—and robust—strategy for functional estimation when such structure is unknown, potentially misspecified, or inapplicable.

2. Construction of Structure-Agnostic Estimators

Most modern structure-agnostic estimators are based on first-order debiasing schemes, typically in one of two forms:

a. Doubly Robust and One-Step Estimators

For functionals with orthogonal scores (mixed-bias, or affine-score regime), first-order correction is performed via sample splitting or cross-fitting:

$$\hat\psi_{\text{1st}} = \psi(\hat\eta) + \frac{1}{n}\sum_{i=1}^n \varphi(Z_i; \hat\eta),$$

where $\varphi(z;\hat\eta)$ is the influence function at plug-in value $\hat\eta$. Plug-in estimators are debiased using the influence function evaluated on a separate sample fold, ensuring independence and controlling higher-order bias (Balakrishnan et al., 2023, Jin et al., 19 Dec 2025).

For ATE, the canonical doubly robust estimator (also called augmented inverse probability weighting, AIPW) is:

$$\hat\theta^{\mathrm{DR}} = \frac1n\sum_{i=1}^n \left[ \hat g(1,X_i) - \hat g(0,X_i) + \frac{D_i - \hat m(X_i)}{\hat m(X_i)(1-\hat m(X_i))} \left( Y_i - \hat g(D_i, X_i) \right) \right],$$

which is minimax-optimal under agnostic $L^2$-oracle rates for the nuisances (Jin et al., 22 Feb 2024, Jin et al., 19 Dec 2025).
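
The AIPW formula can be sketched in code as follows. This is a minimal illustration and not any paper's reference implementation: the helper names `fit_group_mean`, `cross_fitted_aipw`, and `simulate`, the toy group-mean learner standing in for a black-box method, the propensity clipping level `clip`, and the simulated data (true ATE of 2) are all assumptions made for the example.

```python
import random
from collections import defaultdict


def fit_group_mean(keys, values, default):
    """Toy stand-in for a black-box learner: predict the training mean per key."""
    sums, counts = defaultdict(float), defaultdict(int)
    for k, v in zip(keys, values):
        sums[k] += v
        counts[k] += 1
    means = {k: sums[k] / counts[k] for k in sums}
    return lambda k: means.get(k, default)


def cross_fitted_aipw(X, D, Y, n_folds=2, clip=0.01, seed=0):
    """Cross-fitted AIPW estimate of the ATE (hypothetical helper).

    Nuisances are fit on the other folds and evaluated on the held-out fold.
    """
    n = len(Y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    total = 0.0
    for k in range(n_folds):
        held_out = set(folds[k])
        train = [i for i in idx if i not in held_out]
        y_bar = sum(Y[i] for i in train) / len(train)
        d_bar = sum(D[i] for i in train) / len(train)
        # g_hat(d, x) estimates E[Y | D=d, X=x]; m_hat(x) estimates P(D=1 | X=x).
        g_hat = fit_group_mean([(D[i], X[i]) for i in train],
                               [Y[i] for i in train], y_bar)
        m_hat = fit_group_mean([X[i] for i in train],
                               [D[i] for i in train], d_bar)
        for i in folds[k]:
            # Clip the propensity away from 0 and 1 (an assumption of this sketch).
            m = min(max(m_hat(X[i]), clip), 1.0 - clip)
            plug_in = g_hat((1, X[i])) - g_hat((0, X[i]))
            correction = (D[i] - m) / (m * (1.0 - m)) * (Y[i] - g_hat((D[i], X[i])))
            total += plug_in + correction
    return total / n


def simulate(n, seed=0):
    """Simulated data with a discrete covariate and a true ATE of 2."""
    rng = random.Random(seed)
    X, D, Y = [], [], []
    for _ in range(n):
        x = rng.randrange(3)
        d = 1 if rng.random() < 0.3 + 0.2 * x else 0
        y = 2.0 * d + x + rng.gauss(0.0, 1.0)
        X.append(x)
        D.append(d)
        Y.append(y)
    return X, D, Y


X, D, Y = simulate(4000, seed=1)
ate_hat = cross_fitted_aipw(X, D, Y)
print(f"cross-fitted AIPW estimate: {ate_hat:.3f}")
```

Any learner can replace `fit_group_mean` as long as it is trained only on the folds that exclude the evaluation points; the debiasing step itself does not change.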

b. Higher-Order Robust Estimators (ACE Procedures)

In certain partially linear models with non-Gaussian treatment noise $\eta$, higher-order orthogonality (ACE estimators) can reduce bias rates beyond those achievable by standard DML/AIPW. Such estimators exploit cumulant-based moment functions to cancel higher-order Taylor expansions of the bias, attaining error rates $O(\varepsilon_1^r \varepsilon_2 + n^{-1/2})$ where $\varepsilon_1, \varepsilon_2$ are $L^2$-errors of the nuisances, and $r$ is determined by the non-Gaussianity of the noise (Jin et al., 3 Jul 2025). For Gaussian or binary treatments, standard DML/AIPW remains minimax-optimal.

3. Optimality Theory and Minimax Lower Bounds

Sharp minimax lower bounds for structure-agnostic estimators have been extensively developed:

  • In settings where the score function is affine in the main nuisance (mixed-bias regime), no estimator can outperform the product of the nuisance $L^2$-rates plus the sampling variance (e.g., $e_n f_n + 1/n$ for ATE).
  • In general (non-affine) regimes, an additional quadratic term appears (e.g., $e_n f_n + e_n^2 + 1/n$).
  • The upper bounds attained by first-order debiased estimators match the minimax lower bounds up to constants in both regimes (Jin et al., 19 Dec 2025, Jin et al., 22 Feb 2024, Balakrishnan et al., 2023).

The central implication is that, absent any further structural constraint, the price of agnostic inference is exact: product rates for orthogonalizable functionals, and extra quadratic rates for more general settings.

| Regime | Lower Bound on Error | Attained by DML/AIPW |
|---|---|---|
| Affine (Mixed-bias) | $O(\Delta_{n,\gamma} \Delta_{n,\alpha} + n^{-1/2})$ | Yes |
| Non-affine (Curved score) | $O(\Delta_{n,\gamma} \Delta_{n,\alpha} + \Delta_{n,\gamma}^2 + n^{-1/2})$ | Yes, with extra term |

These results confirm that no estimator using only black-box $L^2$-rate information can uniformly improve on DML/AIPW without imposing further structure (Jin et al., 19 Dec 2025, Jin et al., 22 Feb 2024).
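
To make the two regimes concrete, the following sketch computes which term dominates when the nuisance errors decay polynomially. It assumes, purely for illustration, that the two nuisance errors scale as $n^{-a}$ and $n^{-b}$ and uses the $n^{-1/2}$ convention for the sampling term; the function name `error_exponent` and its parameters are hypothetical.

```python
def error_exponent(a, b, affine=True):
    """Return c such that the rate in the table above is of order n^{-c}.

    Assumes nuisance errors of order n^{-a} (the nuisance that enters the
    quadratic term) and n^{-b}, with a, b >= 0. Illustrative helper only.
    """
    terms = [a + b, 0.5]      # product term and sampling term n^{-1/2}
    if not affine:
        terms.append(2 * a)   # extra quadratic term in the non-affine regime
    return min(terms)         # the slowest-decaying term dominates


# Affine regime: root-n is reached once a + b >= 1/2.
print(error_exponent(0.25, 0.25))
# Non-affine regime: a slow first nuisance limits the rate through 2a.
print(error_exponent(0.1, 0.4, affine=False))
```

Under these assumptions the affine regime reaches the $n^{-1/2}$ rate whenever $a + b \ge 1/2$, while the non-affine regime additionally needs $a \ge 1/4$.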

4. Extensions and Methodological Generality

Structure-agnostic estimators have broad applicability:

  • General Functionals: The framework applies not only to ATE/ATT but to a wide class of functionals $T(\eta(P_0))$, including density integrals, expected conditional covariances, and quadratic functionals (Jin et al., 19 Dec 2025, Balakrishnan et al., 2023, McClean et al., 22 Mar 2024).
  • Black-Box Learners: Any regression/classification method that admits an $L^2$ error guarantee (random forests, neural nets, boosting, Lasso, SuperLearner, etc.) can be used as a plug-in oracle. The debiasing step is independent of the learning method (Jin et al., 22 Feb 2024).
  • Cross-Fitting and Double Cross-Fitting: Cross-fitting is essential to avoid empirical process bias, and double cross-fitting further factors the bias for expected conditional covariance estimation, yielding sharper error decompositions (McClean et al., 22 Mar 2024).
  • Robust Structure-Blind Estimation in Signal Processing: In Gaussian signal recovery, structure-blind estimators that impose only a minimal constraint in the Fourier domain can adapt to unknown shift-invariant subspaces, achieving oracle inequalities relative to the best (unknown) linear estimator (Ostrovsky et al., 2016).
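
The double cross-fitting idea mentioned above can be sketched for the expected conditional covariance $\mathbb{E}[\mathrm{Cov}(A, Y \mid X)]$. In this illustration the two nuisance regressions are trained on two separate folds and the product of residuals is averaged on a third; the helper names (`fit_group_mean`, `double_cross_fit_cov`, `simulate_cov`), the toy group-mean learner, the three-fold rotation, and the simulated data (true value 0.5) are all assumptions of the sketch rather than details taken from the cited work.

```python
import random
from collections import defaultdict


def fit_group_mean(keys, values):
    """Toy stand-in for a black-box regression: training mean per key."""
    sums, counts = defaultdict(float), defaultdict(int)
    for k, v in zip(keys, values):
        sums[k] += v
        counts[k] += 1
    overall = sum(values) / len(values)
    means = {k: sums[k] / counts[k] for k in sums}
    return lambda k: means.get(k, overall)


def double_cross_fit_cov(X, A, Y, seed=0):
    """Estimate E[Cov(A, Y | X)] with nuisances trained on separate folds."""
    idx = list(range(len(Y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::3] for k in range(3)]
    estimates = []
    for r in range(3):  # rotate the roles of the three folds
        f_pi, f_mu, f_eval = folds[r], folds[(r + 1) % 3], folds[(r + 2) % 3]
        pi_hat = fit_group_mean([X[i] for i in f_pi], [A[i] for i in f_pi])  # E[A | X]
        mu_hat = fit_group_mean([X[i] for i in f_mu], [Y[i] for i in f_mu])  # E[Y | X]
        products = [(A[i] - pi_hat(X[i])) * (Y[i] - mu_hat(X[i])) for i in f_eval]
        estimates.append(sum(products) / len(products))
    return sum(estimates) / len(estimates)


def simulate_cov(n, seed=0):
    """Simulated data in which Cov(A, Y | X) = 0.5 for every X."""
    rng = random.Random(seed)
    X, A, Y = [], [], []
    for _ in range(n):
        x = rng.randrange(3)
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        X.append(x)
        A.append(x + u)
        Y.append(x + 0.5 * u + v)
    return X, A, Y


X, A, Y = simulate_cov(6000, seed=1)
cov_hat = double_cross_fit_cov(X, A, Y)
print(f"double cross-fit estimate: {cov_hat:.3f}")
```

Training the two regressions on disjoint folds makes their estimation errors independent of each other as well as of the evaluation fold, which is the property that double cross-fitting is meant to provide.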

5. Trade-offs: Structure-Agnostic vs. Structure-Aware Methods

When nuisance functions belong to known smoothness classes (e.g., Hölder), higher-order debiasing can in principle outperform first-order structure-agnostic estimators. For example, when the sum of smoothness indices exceeds the critical threshold, root-$n$ rates can be attained; otherwise, only slower nonparametric rates are achievable with higher-order corrections (Balakrishnan et al., 2023, Bonvini et al., 14 May 2024, McClean et al., 22 Mar 2024).
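
As a rough illustration of this gap, the sketch below compares two conditions for reaching the root-$n$ rate when the nuisances are Hölder smooth with indices `alpha` and `beta` in dimension `d`. Both conditions are assumptions brought in from the standard Hölder-class literature and are not stated in the text above: nuisance learners are taken to converge at the usual rates $n^{-\alpha/(2\alpha+d)}$ and $n^{-\beta/(2\beta+d)}$, and the structure-aware threshold is taken to be $(\alpha+\beta)/2 \ge d/4$. The function names are hypothetical.

```python
def first_order_root_n(alpha, beta, d):
    """Product of standard Holder L2 rates is at least as fast as n^{-1/2}.

    Assumes nuisance errors n^{-alpha/(2*alpha+d)} and n^{-beta/(2*beta+d)}.
    """
    return alpha / (2 * alpha + d) + beta / (2 * beta + d) >= 0.5


def structure_aware_root_n(alpha, beta, d):
    """Assumed smoothness threshold for higher-order, structure-aware methods."""
    return (alpha + beta) / 2 >= d / 4


# Moderately smooth nuisances in dimension 3: only the structure-aware
# condition holds under these assumptions.
print(first_order_root_n(1, 1, 3), structure_aware_root_n(1, 1, 3))
# Smoother nuisances in dimension 2: both conditions hold.
print(first_order_root_n(2, 2, 2), structure_aware_root_n(2, 2, 2))
```

The region where only the second condition holds is where exploiting known smoothness can, in principle, pay off; outside known smoothness classes neither condition can be checked, which is the setting the structure-agnostic approach targets.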

However, these structure-aware estimators depend critically on prior knowledge or valid specification of the function class, and risk misspecification bias if the true nuisances fall outside these classes. The structure-agnostic approach forgoes such risk in exchange for optimality guarantees under minimal assumptions (Jin et al., 19 Dec 2025, Balakrishnan et al., 2023). Hybrid settings, in which only some nuisance components are assumed smooth, admit refined rates interpolating between agnostic and classical minimax predictions (Bonvini et al., 14 May 2024).

6. Practical Guidance and Applications

Key recommendations and implications for practice include:

  • When to Apply: Use structure-agnostic estimators when nuisance structures are unknown, when misspecification risk is high, or when seeking maximal flexibility with black-box learners (Jin et al., 19 Dec 2025, Jin et al., 22 Feb 2024).
  • Performance Guarantees: Provided $L^2$-oracle rates can be bounded or estimated, first-order debiasing yields minimax-optimal estimation and inference for the target parameter, even if nuisance rates are slow or vary across folds (Jin et al., 22 Feb 2024, Balakrishnan et al., 2023, McClean et al., 22 Mar 2024).
  • Choice of Method: For ATE/ATT and other functionals with orthogonal (affine) scores, prefer doubly robust, cross-fitted estimators. For partially linear models with non-Gaussian noise, consider ACE or higher-order cumulant-based estimators if higher moments can be estimated and if noise is independent (Jin et al., 3 Jul 2025).
  • Smoothness-Aware Enhancements: Only pursue higher-order or undersmoothing modifications when there is well-justified smoothness knowledge or credible density regularity (McClean et al., 22 Mar 2024, Bonvini et al., 14 May 2024).
  • Robust Structure-Agnostic Estimation in Computer Vision: Algorithms such as that of Yang & Meer implement structure-agnostic estimation by adaptively linearizing heterogeneous objective functions (e.g., line, ellipse, or fundamental-matrix fitting) and estimating per-structure scales, eliminating hand-tuned thresholds and ranking structures by their inlier support and estimated noise (Yang et al., 2016).
  • Optimal Power Spectrum and Bispectrum Estimation in Cosmology: PolyBin3D implements optimal, unbiased, and structure-agnostic estimators for power spectra and bispectra under arbitrary masks and weightings, leveraging FFT and stochastic trace estimation for efficiency (Philcox et al., 10 Apr 2024).
  • Hybrid Agnosticism + Smoothness Models: New frameworks analyze "hybrid" classes where agnosticism is combined with smoothness constraints only on certain functionals of the filtered regressors, enabling improved rates without global structural imposition (Bonvini et al., 14 May 2024).

These developments demonstrate the breadth of applications for structure-agnostic methodology, spanning causal inference, robustness in vision, optimal cosmological summary statistics, and high-dimensional signal recovery. The unifying principle remains minimax optimality relative to black-box $L^2$ rates, with adaptation to problem-specific extensions as domain knowledge allows.
