Task-Driven Causal Feature Distillation: Towards Trustworthy Risk Prediction (2312.16113v2)

Published 20 Dec 2023 in cs.LG and cs.AI

Abstract: The tremendous recent successes of artificial intelligence in many areas have sparked great interest in its potential for trustworthy and interpretable risk prediction. However, most models lack causal reasoning and struggle with class imbalance, leading to poor precision and recall. To address this, we propose a Task-Driven Causal Feature Distillation (TDCFD) model that transforms original feature values into causal feature attributions for the specific risk prediction task. A causal feature attribution describes how much the value of that feature contributes to the risk prediction result. After the causal feature distillation, a deep neural network is applied to produce trustworthy predictions with causal interpretability and high precision and recall. We evaluate TDCFD on several synthetic and real datasets, and the results demonstrate its superiority over state-of-the-art methods in terms of precision, recall, interpretability, and causality.

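The abstract outlines a two-stage pipeline: first distill raw feature values into task-driven causal attribution scores, then train a deep network on those scores to obtain the final risk prediction. The snippet below is a minimal sketch of that pipeline shape only, not the authors' TDCFD method: the causal distillation stage is stood in for by a simple coefficient-weighted attribution from a logistic regression, and the imbalanced synthetic dataset, model choices, and hyperparameters are placeholders chosen purely for illustration.

```python
# Hedged sketch of a two-stage "feature distillation -> prediction" pipeline.
# NOT the paper's TDCFD implementation: the causal attribution step is
# approximated with input-times-coefficient scores from a logistic regression.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import precision_score, recall_score

# Imbalanced synthetic risk data (assumed binary "risk" label, ~10% positives).
X, y = make_classification(n_samples=4000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Stage 1 (stand-in for causal feature distillation): fit a simple task model
# and convert each sample's feature values into per-feature attribution scores.
stage1 = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_tr, y_tr)
A_tr = X_tr * stage1.coef_   # attribution matrix, same shape as X_tr
A_te = X_te * stage1.coef_

# Stage 2: a neural network trained on the attributions instead of raw features.
stage2 = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500,
                       random_state=0).fit(A_tr, y_tr)
pred = stage2.predict(A_te)
print("precision:", precision_score(y_te, pred),
      "recall:", recall_score(y_te, pred))
```

In the paper's setting, the attribution matrix produced in stage 1 would come from the proposed causal distillation procedure rather than a linear surrogate; the sketch only shows where that component plugs into the overall prediction pipeline.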