Sigmoid-FTRL: Adaptive ATE Estimation
- Sigmoid-FTRL is an adaptive online design strategy that minimizes Neyman regret for average treatment effect estimation using AIPW estimators.
- It decomposes a nonconvex variance minimization problem into two convex learning tasks solved via FTRL updates over treatment probabilities and linear predictors.
- The method achieves asymptotic optimality and enables valid inference through consistently conservative variance estimation and adaptive ridge regression.
Sigmoid-FTRL is an adaptive online experimental design strategy for minimizing variance (Neyman regret) in the estimation of average treatment effects using Augmented Inverse Probability Weighting (AIPW) estimators, explicitly within the design-based potential outcomes framework where both outcomes and covariates are deterministic. The method unifies online convex optimization and adaptive Neyman allocation via a decomposition of a nonconvex variance-minimization problem into two convex online learning problems, efficiently addressed through Follow-the-Regularized-Leader (FTRL) updates over both treatment probabilities and linear predictors. Sigmoid-FTRL establishes asymptotic optimality, supports consistently conservative variance estimation, and enables construction of valid confidence intervals under broad regularity conditions (Chen et al., 25 Nov 2025).
1. Design-Based Setting and Problem Formulation
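Consider $T$ units indexed by $t = 1, \dots, T$, each with a covariate vector $x_t \in \mathbb{R}^d$ bounded in norm ($\|x_t\| \le R$) and deterministic potential outcomes $y_t(1), y_t(0)$. The goal is estimation of the average treatment effect (ATE),
$$\tau = \frac{1}{T} \sum_{t=1}^{T} \bigl( y_t(1) - y_t(0) \bigr),$$
using only randomized assignment. At each round $t$, the procedure selects:
- Assignment probability $p_t \in (0,1)$ as a function of the history
- Linear predictor coefficients $\beta_t(1), \beta_t(0) \in \mathbb{R}^d$
Treatment $Z_t \sim \mathrm{Bernoulli}(p_t)$ is randomized, generating the observed outcome $Y_t = y_t(Z_t)$. For each arm $k \in \{0, 1\}$, online ridge regression is used to fit the linear prediction
$$y_t(k) \approx \langle x_t, \beta_t(k) \rangle$$
with an adaptive regularization parameter.
The adaptive AIPW estimator is
$$\hat\tau = \frac{1}{T} \sum_{t=1}^{T} \left[ \langle x_t, \beta_t(1) \rangle - \langle x_t, \beta_t(0) \rangle + \frac{Z_t \bigl( Y_t - \langle x_t, \beta_t(1) \rangle \bigr)}{p_t} - \frac{(1 - Z_t) \bigl( Y_t - \langle x_t, \beta_t(0) \rangle \bigr)}{1 - p_t} \right],$$
which is unbiased, and whose variance (and thus regret relative to the oracle design) admits closed-form analysis (Chen et al., 25 Nov 2025).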
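As a minimal sketch of how an AIPW point estimate is computed from logged assignments, the function below takes the observed outcomes, treatment indicators, assignment probabilities, and armwise predictions as arrays. The function name `aipw_estimate`, its argument names, and the simulated numbers are illustrative assumptions, not part of the source.

```python
import numpy as np

def aipw_estimate(y_obs, z, p, pred1, pred0):
    """AIPW point estimate of the ATE.

    y_obs : observed outcomes Y_t
    z     : treatment indicators Z_t in {0, 1}
    p     : assignment probabilities p_t (each fixed before Z_t is drawn)
    pred1 : predictions for the treated potential outcome
    pred0 : predictions for the control potential outcome
    """
    y_obs, z, p = (np.asarray(a, dtype=float) for a in (y_obs, z, p))
    pred1 = np.asarray(pred1, dtype=float)
    pred0 = np.asarray(pred0, dtype=float)
    # Inverse-probability-weighted corrections to the plug-in predictions
    treated = z * (y_obs - pred1) / p
    control = (1.0 - z) * (y_obs - pred0) / (1.0 - p)
    return float(np.mean(pred1 - pred0 + treated - control))

# Toy data (made-up numbers): fixed potential outcomes, random assignment
rng = np.random.default_rng(0)
T = 200
y1 = rng.normal(1.0, 1.0, T)
y0 = rng.normal(0.0, 1.0, T)
p = np.full(T, 0.3)
z = rng.binomial(1, p)
y_obs = np.where(z == 1, y1, y0)

ate = float(np.mean(y1 - y0))
# With perfect predictions the correction terms vanish for every assignment
exact = aipw_estimate(y_obs, z, p, y1, y0)
```

When the predictions equal the potential outcomes, both correction terms are zero and the estimate coincides with the ATE for any assignment, which illustrates why better predictors reduce variance.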
2. Neyman Regret and Oracle Design Benchmark
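The “oracle” nonadaptive design fixes both the linear predictors $\beta^*(k)$ (by armwise OLS on all units) and the probability $p^*$, minimizing the expected variance:
$$p^* = \arg\min_{p \in (0,1)} \left[ \frac{1-p}{p}\, S(1)^2 + \frac{p}{1-p}\, S(0)^2 \right] = \frac{S(1)}{S(1) + S(0)},$$
where $S(k)^2 = \frac{1}{T} \sum_{t=1}^{T} r_t(k)^2$, with $r_t(k) = y_t(k) - \langle x_t, \beta^*(k) \rangle$, is the residual variance for potential outcomes under arm $k$.
The benchmark variance, attained by the oracle's estimator $\hat\tau^*$, is
$$\mathrm{Var}(\hat\tau^*) = \frac{2}{T} \Bigl( S(1)\, S(0) + \frac{1}{T} \sum_{t=1}^{T} r_t(1)\, r_t(0) \Bigr).$$
The Neyman regret of any adaptive policy is
$$\mathcal{R}_T = T \cdot \mathrm{Var}(\hat\tau) - T \cdot \mathrm{Var}(\hat\tau^*),$$
highlighting the additional variance incurred by adaptation relative to the nonadaptive oracle (Chen et al., 25 Nov 2025).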
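The oracle benchmark can be illustrated numerically. The sketch below assumes a fixed-probability Bernoulli design with armwise OLS predictors fitted on all units; it computes the residual scales, the resulting Neyman probability, and the normalized variance of the fixed-design AIPW estimator. The helper names `oracle_design` and `normalized_variance` and the simulated data are hypothetical.

```python
import numpy as np

def oracle_design(X, y1, y0):
    """Armwise OLS residuals and the variance-minimizing fixed probability."""
    b1, *_ = np.linalg.lstsq(X, y1, rcond=None)
    b0, *_ = np.linalg.lstsq(X, y0, rcond=None)
    r1, r0 = y1 - X @ b1, y0 - X @ b0
    s1 = np.sqrt(np.mean(r1 ** 2))
    s0 = np.sqrt(np.mean(r0 ** 2))
    return s1 / (s1 + s0), r1, r0

def normalized_variance(p, r1, r0):
    """T * Var of the AIPW estimator under a fixed probability p."""
    term = r1 * np.sqrt((1 - p) / p) + r0 * np.sqrt(p / (1 - p))
    return float(np.mean(term ** 2))

# Made-up data: the treated arm is noisier than the control arm
rng = np.random.default_rng(1)
T, d = 500, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, d - 1))])
y1 = X @ np.array([1.0, 0.5, -0.2]) + rng.normal(scale=2.0, size=T)
y0 = X @ np.array([0.0, 0.3, 0.1]) + rng.normal(scale=0.5, size=T)

p_star, r1, r0 = oracle_design(X, y1, y0)
grid = np.linspace(0.05, 0.95, 181)
best_on_grid = min(normalized_variance(p, r1, r0) for p in grid)
```

Because the treated arm has the larger residual scale in this toy data, the oracle assigns treatment with probability above one half, and no grid point achieves a smaller normalized variance than `p_star`.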
3. Algorithmic Formulation: Decomposition and Convexification
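Direct minimization of Neyman regret is nonconvex in the triple $(p_t, \beta_t(1), \beta_t(0))$. Sigmoid-FTRL circumvents this by decomposing the regret into two convex sequences:
- Probability Regret: For fixed predictors, the per-round loss as a function of the assignment probability is
$$f_t(p) = \Bigl( r_t(1) \sqrt{\tfrac{1-p}{p}} + r_t(0) \sqrt{\tfrac{p}{1-p}} \Bigr)^2, \qquad r_t(k) = y_t(k) - \langle x_t, \beta(k) \rangle,$$
and the probability regret compares the cumulative loss of the played probabilities $p_t$ with that of the best fixed probability in hindsight,
with $f_t$ convex on $(0,1)$.
- Prediction Regret: For fixed $p$, the same per-round quantity viewed as a function of the predictors is
$$g_t\bigl(\beta(1), \beta(0)\bigr) = \Bigl( \bigl( y_t(1) - \langle x_t, \beta(1) \rangle \bigr) \sqrt{\tfrac{1-p}{p}} + \bigl( y_t(0) - \langle x_t, \beta(0) \rangle \bigr) \sqrt{\tfrac{p}{1-p}} \Bigr)^2,$$
and the prediction regret compares the cumulative loss of the played predictors with that of the best fixed predictors in hindsight; $g_t$ is the square of an affine function,
which is jointly convex in $(\beta(1), \beta(0))$.
Lemma 3.3 asserts that the Neyman regret is controlled by these two regrets,
enabling separate convex-optimizable updates (Chen et al., 25 Nov 2025).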
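The two convexity claims can be checked by expanding the variance contribution of a single AIPW term. The derivation below is a sketch under the assumption that a round uses a fixed probability $p$, an assignment $Z \sim \mathrm{Bernoulli}(p)$, and fixed residuals $r(1), r(0)$ of the two potential outcomes; the symbols are illustrative and need not match the paper's notation.

$$
\begin{aligned}
\mathrm{Var}\!\left[ \frac{Z\, r(1)}{p} - \frac{(1-Z)\, r(0)}{1-p} \right]
&= p(1-p) \left( \frac{r(1)}{p} + \frac{r(0)}{1-p} \right)^{2} \\
&= \frac{r(1)^2}{p} + \frac{r(0)^2}{1-p} - \bigl( r(1) - r(0) \bigr)^2 .
\end{aligned}
$$

For fixed residuals, the last line is convex in $p$ on $(0,1)$ because $1/p$ and $1/(1-p)$ are convex there. For fixed $p$, the first line is the square of a function that is affine in the predictor coefficients (through the residuals), hence jointly convex in them. The joint problem is nonconvex only because the probability and the residuals multiply each other.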
4. Sigmoid-FTRL Mechanism
The algorithm maintains parameter such that for a differentiable sigmoid with properties: monotonicity, , as well as specific convexity and derivative decay conditions. Examples include or .
For probability updates:
- Define .
- Use FTRL with regularizer :
where is an IPW-estimator of .
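To make the probability update concrete, the following is a minimal sketch of one FTRL step over a sigmoid parameter. It rests on assumptions that are not taken from the source: the sigmoid is the logistic function, the regularizer is the quadratic $u^2 / (2\eta)$, and `a_sum`, `b_sum`, and `eta` are hypothetical inputs standing in for cumulative squared-residual estimates and a learning-rate parameter. Under the logistic choice, a loss of the form $A/p + B/(1-p)$ becomes $A e^{-u} + B e^{u}$ plus a constant, which is convex in $u$, so the step reduces to a one-dimensional root search.

```python
import math

def logistic(u):
    """Logistic sigmoid (an assumed choice, not necessarily the paper's)."""
    return 1.0 / (1.0 + math.exp(-u))

def ftrl_probability(a_sum, b_sum, eta, lo=-50.0, hi=50.0, iters=200):
    """One FTRL step: minimize a_sum*exp(-u) + b_sum*exp(u) + u**2/(2*eta).

    a_sum, b_sum : cumulative squared-residual estimates for the treated
                   and control arms (hypothetical inputs, e.g. IPW sums)
    eta          : hypothetical regularization scale (larger = weaker)
    Returns (u, p) with p = logistic(u).
    """
    def grad(u):
        # Derivative of the convex objective; strictly increasing in u
        return -a_sum * math.exp(-u) + b_sum * math.exp(u) + u / eta

    # Bisection on the sign of the derivative
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if grad(mid) > 0:
            hi = mid
        else:
            lo = mid
    u = 0.5 * (lo + hi)
    return u, logistic(u)

# Weak regularization: the step approaches the unregularized minimizer
u_weak, p_weak = ftrl_probability(4.0, 1.0, eta=1e12)
# Strong regularization: the step stays close to p = 1/2
u_strong, p_strong = ftrl_probability(4.0, 1.0, eta=1e-3)
```

With negligible regularization the minimizer satisfies $e^{2u} = A/B$, so the probability approaches $\sqrt{A} / (\sqrt{A} + \sqrt{B})$, the Neyman allocation for the estimated residual scales. Stronger regularization shrinks the probability toward one half.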
For linear predictor updates:
- For each arm, solve
Regularization is adaptive: , with .
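Online ridge regression for one arm can be sketched with rank-one updates. The class below is a generic illustration, not the source's update rule: it uses a fixed penalty `lam` rather than the adaptive regularization described above, and the optional `weight` argument is a hypothetical hook for inverse-probability weights. The names `OnlineRidge`, `lam`, and `weight` are assumptions.

```python
import numpy as np

class OnlineRidge:
    """Ridge regression updated one observation at a time.

    Maintains the inverse of (lam * I + sum_s w_s x_s x_s^T) through
    Sherman-Morrison rank-one updates, so each step costs O(d^2).
    """

    def __init__(self, d, lam=1.0):
        self.A_inv = np.eye(d) / lam
        self.b = np.zeros(d)

    def update(self, x, y, weight=1.0):
        x = np.asarray(x, dtype=float)
        Ax = self.A_inv @ x
        # Sherman-Morrison formula for the inverse after adding w * x x^T
        self.A_inv -= weight * np.outer(Ax, Ax) / (1.0 + weight * (x @ Ax))
        self.b += weight * y * x

    @property
    def beta(self):
        return self.A_inv @ self.b

# Made-up data: the streaming fit should match the batch ridge solution
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=50)

model = OnlineRidge(d=4, lam=0.5)
for x_t, y_t in zip(X, y):
    model.update(x_t, y_t)

batch = np.linalg.solve(0.5 * np.eye(4) + X.T @ X, X.T @ y)
```

Keeping the inverse matrix rather than refitting from scratch avoids a full solve at every round, which matters when predictors must be refreshed before each assignment.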
Sequential steps (summarized):
| Step | Description | Complexity |
|---|---|---|
| Prediction update | Ridge regression by arm | per step |
| Probability update | 1D convex minimization in | |
| Residuals | Estimate armwise via IPW sums | (overall) |
No hyperparameter tuning is required; all regularization is data-adaptive (Chen et al., 25 Nov 2025).
5. Theoretical Guarantees
Convergence of Neyman Regret
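Sigmoid-FTRL achieves Neyman regret of order
$$\mathcal{R}_T = O\bigl( T^{-1/2} R^2 \bigr),$$
where $R$ is the maximum covariate norm, assuming bounded moments and well-conditioned Gram matrices after initial samples, for any admissible sigmoid as above. This matches the lower bound
$$\mathcal{R}_T = \Omega\bigl( T^{-1/2} \bigr),$$
where no algorithm can improve on the $T^{-1/2}$ rate under analogous regularity assumptions (demonstrated via a noisy two-armed construction) (Chen et al., 25 Nov 2025).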
Distributional Asymptotics and Inference
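Under non-superefficiency ($\mathrm{Var}(\hat\tau) = \Omega(1/T)$):
$$\frac{\hat\tau - \tau}{\sqrt{\mathrm{Var}(\hat\tau)}} \xrightarrow{d} \mathcal{N}(0, 1),$$
facilitating Wald-type inference.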
Consistent Conservative Variance Estimator
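A variance bound estimator $\widehat{\mathrm{VB}}$,
with residual-variance estimates $\hat S(k)^2$ defined by armwise IPW residuals, is consistent for a conservative upper bound $\mathrm{VB}$ on $\mathrm{Var}(\hat\tau)$:
$$T \cdot \widehat{\mathrm{VB}} - T \cdot \mathrm{VB} \xrightarrow{p} 0.$$
Together, the variance estimator and the point estimator $\hat\tau$ enable construction of asymptotically valid Wald-type confidence intervals,
$$\hat\tau \pm z_{1 - \alpha/2} \sqrt{\widehat{\mathrm{VB}}},$$
with coverage tending to at least $1 - \alpha$ as $T \to \infty$ (Chen et al., 25 Nov 2025).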
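A Wald-type interval is straightforward to compute once a point estimate and a variance estimate are available. The sketch below uses only the Python standard library; the function name `wald_interval` and the example numbers are illustrative assumptions, and `var_hat` stands for whatever conservative estimate of the estimator's variance is in use.

```python
import math
from statistics import NormalDist

def wald_interval(tau_hat, var_hat, alpha=0.05):
    """Two-sided Wald interval tau_hat +/- z_{1-alpha/2} * sqrt(var_hat)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(var_hat)
    return tau_hat - half_width, tau_hat + half_width

# Made-up inputs: point estimate 1.2 with estimated variance 0.04
lower, upper = wald_interval(tau_hat=1.2, var_hat=0.04, alpha=0.05)
```

Because the variance estimate targets an upper bound on the true variance, an interval built this way tends to be wider than necessary rather than too narrow.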
6. Implementation Guidelines
- Sigmoid choice: Recommended includes either or .
- Adaptive regularization: Use , where tracks the maximum covariate norm to date; scaling with known is possible.
- Complexity: Total algorithmic run time is .
- No additional tuning required: There are no separate step-size or clipping parameters beyond the inherent adaptivity and regularization.
A plausible implication is that Sigmoid-FTRL offers a turn-key approach for optimal assignment in design-based adaptive experiments using AIPW estimators.
7. Broader Context and Implications
Sigmoid-FTRL extends the literature connecting Neyman allocation and online convex optimization (OCO) beyond the Horvitz-Thompson estimator, addressing nonconvexity via convex decomposition and FTRL dynamics. It establishes sharp upper and lower regret bounds and supports practical confidence interval construction for deterministic potential outcomes, which is especially relevant for design-based inference in randomized controlled trials and sequential experimentation. The method’s adaptivity and lack of tuning requirements suggest applicability in practical online experiment pipelines without additional complexity (Chen et al., 25 Nov 2025).