Automated Efficient Estimation using Monte Carlo Efficient Influence Functions (2403.00158v2)
Abstract: Many practical problems involve estimating low dimensional statistical quantities with high-dimensional models and datasets. Several approaches address these estimation tasks based on the theory of influence functions, such as debiased/double ML or targeted minimum loss estimation. This paper introduces Monte Carlo Efficient Influence Functions (MC-EIF), a fully automated technique for approximating efficient influence functions that integrates seamlessly with existing differentiable probabilistic programming systems. MC-EIF automates efficient statistical estimation for a broad class of models and target functionals that would previously require rigorous custom analysis. We prove that MC-EIF is consistent, and that estimators using MC-EIF achieve optimal $\sqrt{N}$ convergence rates. We show empirically that estimators using MC-EIF are at parity with estimators using analytic EIFs. Finally, we demonstrate a novel capstone example using MC-EIF for optimal portfolio selection.