Orthogonal machine learning for conditional odds and risk ratios

Published 12 Apr 2026 in stat.ML, cs.LG, and stat.ME | (2604.10412v1)

Abstract: Conditional effects are commonly used measures for understanding how treatment effects vary across different groups, and are often used to target treatments/interventions to groups who benefit most. In this work we review existing methods and propose novel ones, focusing on the odds ratio (OR) and the risk ratio (RR). While estimation of the conditional average treatment effect (ATE) has been widely studied, estimators for the OR and RR lag behind, and cutting edge estimators such as those based on doubly robust transformations or orthogonal risk functions have not been generalized to these parameters. We propose such a generalization here, focusing on the DR-learner and the R-learner. We derive orthogonal risk functions for the OR and RR and show that the associated pseudo-outcomes satisfy second-order conditional-mean remainder properties analogous to the ATE case. We also evaluate estimators for the conditional ATE, OR, and RR in a comprehensive nonparametric Monte Carlo simulation study to compare them with common alternatives under hundreds of different data-generating distributions. Our numerical studies provide empirical guidance for choosing an estimator. For instance, they show that while parametric models are useful in very simple settings, the proposed nonparametric estimators significantly reduce bias and mean squared error in the more complex settings expected in the real world. We illustrate the methods in the analysis of physical activity and sleep trouble in U.S. adults using data from the National Health and Nutrition Examination Survey (NHANES). The results demonstrate that our estimators uncover substantial treatment effect heterogeneity that is obscured by traditional regression approaches and lead to improved treatment decision rules, highlighting the importance of data-adaptive methods for advancing precision health research.

Abstract PDF Upgrade to Chat

Authors (2)

Summary

The paper introduces EIF-based DR- and R-learner estimators for conditional odds and risk ratios, ensuring robustness against nuisance misspecification.
It demonstrates superior performance over traditional logistic and meta-learner approaches in high-complexity, high-dimensional settings with treatment effect heterogeneity.
The empirical and theoretical analyses support reliable individualized treatment rules in observational data and provide practical guidance for estimator selection.

Orthogonal Machine Learning for Conditional Odds and Risk Ratios: An Expert Synthesis

Introduction

This work presents a comprehensive theoretical and empirical development of orthogonal machine learning estimators for conditional odds ratio (OR) and risk ratio (RR) parameters, expanding the domain of doubly robust (DR) and orthogonal statistical learning beyond the conditional average treatment effect (CATE/ATE). The methodology leverages efficient influence function (EIF) theory to construct estimators robust to nuisance parameter misspecification, and the empirical evaluation rigorously compares the novel approaches with both parametric (logistic regression) and nonparametric meta-learners (SuperLearner T/S-learners) under a spectrum of simulated treatment effect heterogeneity and data complexity. The implications bear directly on the construction of reliable individualized treatment rules in high-dimensional, confounded observational data.

Methodological Framework

Target Parameters and Nuisance Structure

The estimands of interest are the conditional ATE, OR, and RR, defined as:

$ATE(X) = P(Y=1|T=1,X) - P(Y=1|T=0,X)$ ,
$OR(X) = \frac{P(Y=1|T=1,X)/(1-P(Y=1|T=1,X))}{P(Y=1|T=0,X)/(1-P(Y=1|T=0,X))}$ ,
$RR(X) = \frac{P(Y=1|T=1,X)}{P(Y=1|T=0,X)}$ ,

where $X$ denotes covariates, $T$ the binary treatment, and $Y$ the binary outcome. The setting assumes full confounder adjustment and nonparametric identification.

Classical plug-in estimators (logistic regression, vanilla SuperLearner T/S-learners) estimate $P(Y=1|T=t,X)$ , then compute contrasts. The DR- and R-learners adapt the pseudo-outcome (AIPW) machinery to form orthogonalized estimates.

Efficient Influence Function-Based DR/R-Learners

The main theoretical advance is the derivation and formal justification of DR-learners and corresponding R-learners directly targeting the conditional OR and RR (on both natural and logarithmic scales). For example, the DR-OR estimator regresses an EIF-based pseudo-outcome for $OR(X)$ on covariates; analogous constructions are made for RR, $\log$ OR, and $\log$ RR. The EIF-based pseudo-outcomes guarantee that their conditional mean matches the target estimand up to a second-order remainder in the error of the estimated nuisance functions $OR(X) = \frac{P(Y=1|T=1,X)/(1-P(Y=1|T=1,X))}{P(Y=1|T=0,X)/(1-P(Y=1|T=0,X))}$ 0 (outcome regressions and propensity), crucially yielding Neyman orthogonality under the Foster-Syrgkanis framework (2604.10412).

The R-learner variants are realized as propensity-overlap-weighted squared loss minimizations of the same pseudo-outcomes. The full theoretical results establish:

Conditional mean representations with only second-order bias in nuisance error,
Population risk minimization properties,
Orthogonal loss function structure leading to favorable excess risk bounds,
Complete proofs for all considered contrasts and both DR and R configurations.

Monte Carlo Evaluation

Simulation Design

The simulations rigorously stress-test the methods on a grid of data-generating processes (DGPs) varying interaction order among covariates, presence of treatment effect heterogeneity (HTE), and sample size. DGPs are sampled using Dirichlet and GP process priors to capably emulate realistic and complex confounding and heterogeneity structures.

Competing Methods

LR/S-Learner: Logistic regression, with and without covariate interactions.
T-Learner/SuperLearner: Meta-learners splitting or pooling on $OR(X) = \frac{P(Y=1|T=1,X)/(1-P(Y=1|T=1,X))}{P(Y=1|T=0,X)/(1-P(Y=1|T=0,X))}$ 1.
Plug-in DR: Two-stage pseudo-outcome regression (DR-P), standard within the CATE literature.
Novel DR- and R-learners: EIF-based for OR, RR, and their log variants.

Metrics

Estimation performance was assessed using integrated mean squared error (iMSE), integrated squared bias, and variance, with transformations to log scale for ratios to facilitate interpretable comparison.

Results

Figure 1: Conditional OR estimation performance (iMSE, iBias $OR(X) = \frac{P(Y=1|T=1,X)/(1-P(Y=1|T=1,X))}{P(Y=1|T=0,X)/(1-P(Y=1|T=0,X))}$ 2, iVariance) for interaction order 3, sample size 2000.

Across high-complexity DGPs with strong HTE and large $OR(X) = \frac{P(Y=1|T=1,X)/(1-P(Y=1|T=1,X))}{P(Y=1|T=0,X)/(1-P(Y=1|T=0,X))}$ 3, the proposed DR-LOR outperforms parametric approaches and nonparametric S/T-learners, yielding dominant iMSE and significant bias mitigation (see Figure 1). Parametric models display considerable bias, only being competitive when the underlying signal is simple or sample size is small (where variance dominates). For moderate and low HTE or small $OR(X) = \frac{P(Y=1|T=1,X)/(1-P(Y=1|T=1,X))}{P(Y=1|T=0,X)/(1-P(Y=1|T=0,X))}$ 4, SuperLearner and LR achieve comparable or better iMSE due to lower variance despite higher bias. DR-P strikes an intermediate bias-variance balance but fails to match direct-targeting DR-LOR and DR-LRR in the high-complexity regime.

The empirical reliability functions for iMSE demonstrate that the DR-learner versions achieve superior right-tail behavior in iMSE, a critical consideration for practitioners targeting worst-case DGPs.

Figure 2: Conditional OR estimation for interaction order 3, sample size 200 (variance-dominated regime).

R-learner variants invariably exhibit excessive variance, even in well-powered settings, due to aggressive weighting and should thus be eschewed in favor of DR-based or ensemble methods.

Coverage of All Configurations

The ranking is robust across all metrics and sample sizes, as detailed in supplementary tables. The debiased ratio learners (DR-LOR/DR-LRR) provide the most significant gains in nonparametric/high-variance settings, with the bias-variance characterization governing practical selection.

Real Data Application: NHANES Physical Activity Effect on Sleep

The methodology is illustrated using NHANES data to estimate the effect of physical activity on trouble sleeping, leveraging a rich set of demographic and clinical covariates.

Figure 3: Distribution of estimated conditional log-odds ratios and evaluation of treatment rules for reducing sleep trouble incidence using DR-LOR, R-LOR, SL-T, LR, and naive treat-all recommendation.

The empirical conditional effect heterogeneity captured by DR-LOR/R-LOR is broad, in contrast to the constant effect from logistic regression, and leads to improved individualized treatment rule evaluation (lower mean counterfactual risk by TMLE estimation). This validates the practical gain from EIF-driven approaches persisting even in complex, confounded real-world data.

Theoretical and Practical Implications

Theoretical advances include universality of orthogonal risk minimization for ratio estimands, with full Neyman orthogonality established for loss functions constructed from EIFs of OR and RR. Oracle inequalities and fourth-order convergence rates in nuisance estimation error confirm super-efficiency in the doubly robust regime.

Practically, the results provide direct recommendations:

Use DR-learner estimators (log-scale variants) for conditional OR/RR in settings of high complexity, sufficient sample size, and strong likelihood of model misspecification.
For simpler settings or small $OR(X) = \frac{P(Y=1|T=1,X)/(1-P(Y=1|T=1,X))}{P(Y=1|T=0,X)/(1-P(Y=1|T=0,X))}$ 5, default to LR or SuperLearner, optimizing for stability and computational tractability.
Avoid R-learner weighting for ratio contrasts.

Future Directions

The study is limited by the dimensionality supported in simulation and by a focus on binary outcomes/treatments. Extensions would generalize orthogonal pseudo-outcome derivations to continuous or time-to-event endpoints and scale the empirical exploration to larger covariate spaces. Integrating these developments with calibration and uncertainty quantification for individualized rules remains an important next step.

Conclusion

This paper rigorously establishes the feasibility and superiority of orthogonal machine learning for conditional odds and risk ratios, equipping practitioners with robust, low-bias estimators for advanced personalized effect discovery and decision rule development in high-dimensional, confounded data. The empirical results and theoretical guarantees together support wide adoption in causal inference and precision medicine pipelines.

Figure 4: Conditional OR estimation (iMSE, iBias $OR(X) = \frac{P(Y=1|T=1,X)/(1-P(Y=1|T=1,X))}{P(Y=1|T=0,X)/(1-P(Y=1|T=0,X))}$ 6, iVariance) for interaction order 1, sample size 2000, demonstrating increased estimator agreement as underlying structure simplifies.

Figure 5: Conditional RR estimation for interaction order 3, sample size 2000 (log-scale), illustrating congruence with findings from the OR analysis and gains of DR-LRR in high complexity.

Reference: "Orthogonal machine learning for conditional odds and risk ratios" (2604.10412)

Markdown Report Issue