Contagion Effect Estimation Using Proximal Embeddings (2306.02479v3)
Abstract: The contagion effect is the causal effect of peers' behavior on an individual's outcome in a social network. Contagion can be confounded by latent homophily, which makes contagion effect estimation difficult: nodes in a homophilic network tend to have ties to peers with similar attributes and can behave similarly without influencing one another. One way to account for latent homophily is to use proxies for the unobserved confounders. However, as we demonstrate in this paper, existing proxy-based methods for contagion effect estimation have very high variance when the proxies are high-dimensional. To address this issue, we introduce a novel framework, Proximal Embeddings (ProEmb), that integrates variational autoencoders (VAEs) with adversarial networks to create low-dimensional representations of high-dimensional proxies and help identify contagion effects. While VAEs have been used previously for representation learning in causal inference, a novel aspect of our approach is the additional adversarial network component, which balances the representations of different treatment groups; this is essential in causal inference from observational data, where these groups typically come from different distributions. We empirically show that our method significantly increases the accuracy and reduces the variance of contagion effect estimation in observational network data compared to state-of-the-art methods.
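The confounding described in the abstract can be illustrated with a small simulation. The sketch below is not ProEmb and is not taken from the paper; it is a toy model under assumed parameters (a single binary latent trait, one peer per node, and the hypothetical names `simulate`, `diff_in_means`, and `stratified_diff`). Peers have no influence on the outcome by construction, yet a naive comparison reports a sizable "contagion effect" that disappears once the latent trait is adjusted for. In real data that trait is unobserved, which is the gap proxy-based methods try to close.

```python
import random


def simulate(n=4000, p_same=0.9, p_high=0.8, p_low=0.2, seed=0):
    """Toy homophilous network in which the true contagion effect is zero.

    All parameter values are illustrative assumptions, not from the paper.
    Returns one (trait, peer_behavior, outcome) tuple per node.
    """
    rng = random.Random(seed)
    # Latent binary trait: the unobserved confounder.
    trait = [rng.random() < 0.5 for _ in range(n)]
    groups = {
        True: [i for i in range(n) if trait[i]],
        False: [i for i in range(n) if not trait[i]],
    }

    def draw(i):
        # Behavior depends only on the node's own trait, never on peers.
        return int(rng.random() < (p_high if trait[i] else p_low))

    behavior = [draw(i) for i in range(n)]  # earlier behavior, seen by peers
    outcome = [draw(i) for i in range(n)]   # later outcome of each node

    rows = []
    for i in range(n):
        # Homophily: the single peer usually shares the node's trait.
        same = rng.random() < p_same
        peer_group = groups[trait[i] if same else (not trait[i])]
        j = rng.choice(peer_group)
        rows.append((trait[i], behavior[j], outcome[i]))
    return rows


def diff_in_means(rows):
    """Mean outcome with an active peer minus mean outcome without one."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def stratified_diff(rows):
    """Oracle adjustment: difference in means within each trait stratum,
    averaged with weights proportional to stratum size."""
    total = 0.0
    for value in (True, False):
        stratum = [r for r in rows if r[0] == value]
        total += diff_in_means(stratum) * len(stratum) / len(rows)
    return total


rows = simulate()
naive = diff_in_means(rows)
adjusted = stratified_diff(rows)
print(f"naive estimate:    {naive:.3f}")
print(f"adjusted estimate: {adjusted:.3f}")
```

Because the adjustment here conditions on the latent trait directly, it is an oracle that no observational study has access to; it only shows what a successful correction for latent homophily would recover.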