Domain Adaptive Decision Trees: Implications for Accuracy and Fairness (2302.13846v2)

Published 27 Feb 2023 in cs.LG, cs.IT, and math.IT

Abstract: In uses of pre-trained machine learning models, it is a known issue that the target population in which the model is being deployed may not have been reflected in the source population with which the model was trained. This can result in a biased model when deployed, leading to a reduction in model performance. One risk is that, as the population changes, certain demographic groups will be under-served or otherwise disadvantaged by the model, even as they become more represented in the target population. The field of domain adaptation proposes techniques for a situation where label data for the target population does not exist, but some information about the target distribution does exist. In this paper we contribute to the domain adaptation literature by introducing domain-adaptive decision trees (DADT). We focus on decision trees given their growing popularity due to their interpretability and performance relative to other more complex models. With DADT we aim to improve the accuracy of models trained in a source domain (or training data) that differs from the target domain (or test data). We propose an in-processing step that adjusts the information gain split criterion with outside information corresponding to the distribution of the target population. We demonstrate DADT on real data and find that it improves accuracy over a standard decision tree when testing in a shifted target population. We also study the change in fairness under demographic parity and equal opportunity. Results show an improvement in fairness with the use of DADT.
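The abstract describes an in-processing step that adjusts the information-gain split criterion using outside information about the target population's distribution. The paper's exact DADT criterion is not reproduced here; the sketch below only illustrates the general idea, under the assumption that target-side knowledge enters as importance weights (the ratio of assumed target to source group marginals) folded into a weighted information gain. All names, the toy data, and the 0.2/0.5 group marginals are hypothetical.

```python
import numpy as np

def entropy(labels, weights=None):
    """Shannon entropy of a label vector, optionally sample-weighted."""
    if weights is None:
        weights = np.ones(len(labels))
    total = weights.sum()
    probs = np.array([weights[labels == c].sum() / total
                      for c in np.unique(labels)])
    return -np.sum(probs * np.log2(probs))

def information_gain(feature, labels, weights=None):
    """Information gain of splitting on `feature`, under optional weights."""
    if weights is None:
        weights = np.ones(len(labels))
    total = weights.sum()
    child = 0.0
    for v in np.unique(feature):
        mask = feature == v
        child += (weights[mask].sum() / total) * entropy(labels[mask], weights[mask])
    return entropy(labels, weights) - child

# Toy source sample: group 1 is underrepresented (~20%) and has a
# different label rate than group 0.
rng = np.random.default_rng(0)
group = (rng.random(1000) < 0.2).astype(int)
labels = (rng.random(1000) < 0.3 + 0.4 * group).astype(int)

# Hypothetical marginals: group 1 is 20% of the source but 50% of the
# target, so its samples are up-weighted by 0.5 / 0.2 = 2.5.
source_p = np.where(group == 1, 0.2, 0.8)
target_p = np.where(group == 1, 0.5, 0.5)
weights = target_p / source_p

ig_standard = information_gain(group, labels)          # source-only criterion
ig_adapted = information_gain(group, labels, weights)  # target-informed criterion
print(ig_standard, ig_adapted)
```

Because the weights shift the effective group proportions toward the target population, the adapted criterion scores splits as if the tree were being grown on target-distributed data, which is the intuition behind adjusting the split criterion rather than the training sample itself.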

