
Learning Optimal and Fair Decision Trees for Non-Discriminative Decision-Making (1903.10598v1)

Published 25 Mar 2019 in cs.LG and stat.ML

Abstract: In recent years, automated data-driven decision-making systems have enjoyed a tremendous success in a variety of fields (e.g., to make product recommendations, or to guide the production of entertainment). More recently, these algorithms are increasingly being used to assist socially sensitive decision-making (e.g., to decide who to admit into a degree program or to prioritize individuals for public housing). Yet, these automated tools may result in discriminative decision-making in the sense that they may treat individuals unfairly or unequally based on membership to a category or a minority, resulting in disparate treatment or disparate impact and violating both moral and ethical standards. This may happen when the training dataset is itself biased (e.g., if individuals belonging to a particular group have historically been discriminated upon). However, it may also happen when the training dataset is unbiased, if the errors made by the system affect individuals belonging to a category or minority differently (e.g., if misclassification rates for Blacks are higher than for Whites). In this paper, we unify the definitions of unfairness across classification and regression. We propose a versatile mixed-integer optimization framework for learning optimal and fair decision trees and variants thereof to prevent disparate treatment and/or disparate impact as appropriate. This translates to a flexible schema for designing fair and interpretable policies suitable for socially sensitive decision-making. We conduct extensive computational studies that show that our framework improves the state-of-the-art in the field (which typically relies on heuristics) to yield non-discriminative decisions at lower cost to overall accuracy.

Learning Optimal and Fair Decision Trees for Non-Discriminative Decision-Making

In this paper, the authors propose a robust and versatile method for constructing decision trees that are simultaneously optimal and fair, aimed at mitigating discrimination in automated decision-making systems. The paper addresses a critical concern in the deployment of ML for socially sensitive contexts, where decisions could have substantial ethical and moral consequences, such as employment or public service allocation.

Key Highlights of the Paper

The research introduces a novel mixed-integer optimization (MIO) framework that unifies the formulation of disparate treatment and disparate impact into a coherent model applicable to both classification and regression tasks. The model is particularly innovative because it allows for the customization of fairness constraints without sacrificing decision tree interpretability — a significant advancement over conventional heuristic-based methods.
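To make the two fairness notions concrete, a common way to quantify disparate impact for binary predictions is the gap in positive-outcome rates between a protected group and the rest of the population; constraints in the MIO model bound quantities of this kind. The sketch below is a toy illustration of that measure, not the paper's exact indices; the function name and inputs are illustrative assumptions.

```python
def disparate_impact_gap(y_pred, protected):
    """Absolute difference in positive-prediction rates across groups.

    y_pred    -- list of 0/1 predictions
    protected -- list of 0/1 group-membership indicators
    """
    in_group = [p for p, g in zip(y_pred, protected) if g == 1]
    out_group = [p for p, g in zip(y_pred, protected) if g == 0]
    rate_in = sum(in_group) / len(in_group)
    rate_out = sum(out_group) / len(out_group)
    return abs(rate_in - rate_out)

# Example: predictions favour the unprotected group.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
print(disparate_impact_gap(preds, group))  # 0.5 (0.75 vs 0.25)
```

A fairness constraint in the optimization model would then require this gap to stay below a decision-maker-chosen tolerance.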

Technical Contributions

  1. Mathematical Formalization: The paper mathematically formalizes discrimination by defining indices for measuring disparate treatment and disparate impact. This allows for quantifying and addressing discrimination systematically within the ML model.
  2. Unifying Framework: A major contribution is the unification of fairness objectives in a mixed-integer programming framework. The model's flexibility is demonstrated through its ability to optimally balance accuracy and fairness while accommodating both categorical and continuous input variables.
  3. Generalization of Decision Trees: The model extends traditional decision trees by incorporating linear branching and leafing rules, enhancing both flexibility and interpretability. This development surpasses existing MIP-based decision tree models that rely on one-hot encoding of categorical features.
  4. Customizable Interpretability: The ability to impose constraints on the decision tree structure, such as depth and feature repetition, enables decision-makers to tune the model according to interpretability needs in socially sensitive settings.
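The "optimal and fair" idea in contribution 2 can be sketched in miniature: among all candidate trees, pick the most accurate one whose unfairness stays below a hard bound. The brute-force example below does this for depth-1 threshold trees on a single feature; it is a hypothetical toy analogue of the accuracy-versus-fairness trade-off, not the paper's MIO formulation, and the names `best_fair_stump` and `delta` are illustrative assumptions.

```python
def positive_rate(pred, protected, g):
    """Fraction of positive predictions within group g."""
    sel = [p for p, gi in zip(pred, protected) if gi == g]
    return sum(sel) / len(sel)

def best_fair_stump(x, y, protected, delta):
    """Exhaustive search over thresholds: return the most accurate
    depth-1 tree whose positive-rate gap between groups is at most delta."""
    best_acc, best_t = -1.0, None
    for t in sorted(set(x)):
        pred = [1 if xi >= t else 0 for xi in x]
        gap = abs(positive_rate(pred, protected, 1)
                  - positive_rate(pred, protected, 0))
        if gap > delta:
            continue  # violates the fairness constraint
        acc = sum(p == yi for p, yi in zip(pred, y)) / len(y)
        if acc > best_acc:
            best_acc, best_t = acc, t
    return best_t, best_acc

x = [1, 2, 3, 4, 5, 6, 7, 8]   # single feature
y = [0, 0, 0, 1, 1, 1, 1, 1]   # labels
g = [1, 1, 1, 1, 0, 0, 0, 0]   # protected-group membership
print(best_fair_stump(x, y, g, delta=0.5))  # (3, 0.875): fairness costs one error
print(best_fair_stump(x, y, g, delta=1.0))  # (4, 1.0): unconstrained optimum
```

The MIO framework plays the same game at scale: rather than enumerating trees, it encodes branching decisions as integer variables and lets a solver find the certifiably optimal tree subject to the fairness constraints.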

Numerical Results and Implications

Extensive computational studies validate the framework's efficacy in producing fair and non-discriminative decision trees with negligible sacrifice to overall accuracy. Experiments on standard datasets, including credit default prediction and income classification, reveal that the proposed MIP-DT approach outperforms existing fairness-oriented algorithms on both fairness and accuracy measures. The empirical results illustrate a desirable trade-off between fairness and accuracy, with the added benefit of enhanced interpretability.

Practical and Theoretical Implications

The proposed methodology holds substantial implications for the design of AI systems in socially sensitive applications. On a practical level, it provides organizations with a tool to ensure compliance with ethical standards and legal expectations concerning discrimination. From a theoretical standpoint, the framework sets a precedent for the use of mixed-integer optimization in balancing complex multi-objective tasks in machine learning, paving the way for further exploration into more intricate fairness constraints and the development of scalable solutions.

Future Directions

While the paper represents a significant step forward in fair machine learning research, several areas for further investigation remain. Future work could explore the scalability of the framework to larger datasets and more complex decision-making scenarios. Additionally, continuing to refine the trade-off between interpretability and complexity could make these tools even more valuable for practitioners dealing with highly sensitive decisions.

In summary, this paper presents a rigorous and flexible approach to constructing decision trees that uphold fairness without compromising accuracy or interpretability, addressing a critical need in the algorithmic decision-making landscape. The ability to tailor the decision-making process to societal standards and ethical norms is an essential step toward the responsible deployment of AI systems.

Authors (3)
  1. Sina Aghaei (10 papers)
  2. Mohammad Javad Azizi (2 papers)
  3. Phebe Vayanos (21 papers)
Citations (166)