
Geometry-Aware Instrumental Variable Regression (2405.11633v1)

Published 19 May 2024 in cs.LG and stat.ML

Abstract: Instrumental variable (IV) regression can be approached through its formulation in terms of conditional moment restrictions (CMR). Building on variants of the generalized method of moments, most CMR estimators are implicitly based on approximating the population data distribution via reweightings of the empirical sample. For large sample sizes in the independent and identically distributed (IID) setting, reweightings can provide sufficient flexibility, but they might fail to capture the relevant information in the presence of corrupted data or data prone to adversarial attacks. To address these shortcomings, we propose the Sinkhorn Method of Moments, an optimal transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information. We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings but improves robustness against data corruption and adversarial attacks.
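
To make the moment-restriction view of IV regression concrete, the following is a minimal sketch of a linear, just-identified IV estimator on simulated data. It is not the Sinkhorn Method of Moments described in the abstract. The data-generating process, the coefficient values, and the function names `simulate`, `ols_slope`, and `iv_slope` are all assumptions made for illustration.

```python
import numpy as np


def simulate(n=20000, beta=2.0, seed=0):
    """Draw (z, t, y) from a hypothetical linear model with a hidden confounder u."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                    # instrument: affects y only through t
    u = rng.normal(size=n)                    # unobserved confounder
    t = z + u + 0.1 * rng.normal(size=n)      # treatment depends on z and u
    y = beta * t + u + 0.1 * rng.normal(size=n)
    return z, t, y


def ols_slope(t, y):
    """Least-squares slope of y on t (no intercept; the data are zero-mean)."""
    return float(t @ y / (t @ t))


def iv_slope(z, t, y):
    """Solve the empirical moment condition mean(z * (y - b * t)) = 0 for b."""
    return float(z @ y / (z @ t))


if __name__ == "__main__":
    z, t, y = simulate()
    print("OLS estimate:", ols_slope(t, y))   # biased because t is correlated with u
    print("IV estimate: ", iv_slope(z, t, y)) # close to the assumed beta = 2.0
```

In this sketch the confounder pushes the least-squares slope away from the assumed coefficient, while the estimate obtained from the moment condition stays close to it. Note that this estimator weights every sample point equally, which is the simplest case of the empirical-sample-based approach the abstract contrasts with its geometry-aware alternative.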

