
Data-adaptive exposure thresholds for the Horvitz-Thompson estimator of the Average Treatment Effect in experiments with network interference (2405.15887v2)

Published 24 May 2024 in stat.ME

Abstract: Randomized controlled trials often suffer from interference, a violation of the Stable Unit Treatment Value Assumption (SUTVA) in which a unit's treatment assignment affects the outcomes of its neighbors. This interference causes bias in naive estimators of the average treatment effect (ATE). A popular method to achieve unbiasedness is to pair the Horvitz-Thompson estimator of the ATE with a known exposure mapping: a function that identifies which units in a given randomization are not subject to interference. For example, an exposure mapping can specify that any unit with at least $h$-fraction of its neighbors having the same treatment status does not experience interference. However, this threshold $h$ is difficult to elicit from domain experts, and a misspecified threshold can induce bias. In this work, we propose a data-adaptive method to select the $h$-fraction threshold that minimizes the mean squared error of the Horvitz-Thompson estimator. Our method estimates the bias and variance of the Horvitz-Thompson estimator under different thresholds using a linear dose-response model of the potential outcomes. We present simulations illustrating that our method improves upon non-adaptive choices of the threshold. We further illustrate the performance of our estimator by running experiments on a publicly available Amazon product similarity graph. Furthermore, we demonstrate that our method is robust to deviations from the linear potential outcomes model.
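
To make the threshold exposure mapping in the abstract concrete, the following is a minimal sketch of a Horvitz-Thompson ATE estimate for one fixed threshold $h$. It is not the paper's data-adaptive selection procedure. It assumes independent Bernoulli($p$) treatment assignment, which the abstract does not specify, so that exposure probabilities have a closed binomial form. The names `binom_tail` and `ht_ate` and the adjacency-list input are hypothetical choices made for illustration.

```python
import math


def binom_tail(d, q, k):
    """Return P(Binomial(d, q) >= k), computed exactly."""
    return sum(math.comb(d, j) * q**j * (1 - q) ** (d - j) for j in range(k, d + 1))


def ht_ate(adj, z, y, h, p):
    """Horvitz-Thompson ATE estimate under an h-fraction exposure mapping.

    adj: adjacency list, adj[i] is the list of neighbors of unit i
    z:   0/1 treatment assignments
    y:   observed outcomes
    h:   threshold in [0, 1]; a unit counts as "exposed" when at least an
         h-fraction of its neighbors share its own treatment status
    p:   treatment probability (ASSUMPTION: independent Bernoulli(p) design)
    """
    n = len(adj)
    total = 0.0
    for i in range(n):
        d = len(adj[i])
        # Minimum number of same-status neighbors; the small offset guards
        # against floating-point overshoot in h * d (e.g. 0.7 * 10).
        k = max(0, math.ceil(h * d - 1e-12))
        same = sum(1 for j in adj[i] if z[j] == z[i])
        if same < k:
            continue  # unit does not satisfy the exposure condition
        if z[i] == 1:
            # P(unit treated and at least k of its d neighbors treated)
            total += y[i] / (p * binom_tail(d, p, k))
        else:
            # P(unit in control and at least k of its d neighbors in control)
            total -= y[i] / ((1 - p) * binom_tail(d, 1 - p, k))
    return total / n
```

Under these assumptions, setting `h = 0` recovers the ordinary Horvitz-Thompson estimator that ignores neighbors, while larger `h` keeps fewer units and reweights them by smaller exposure probabilities, which is the bias-variance trade-off the abstract refers to.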

