Fairness vs Performance: Characterizing the Pareto Frontier of Algorithmic Decision Systems

Published 11 May 2026 in cs.LG, cs.AI, and cs.CY | (2605.10604v1)

Abstract: Designing fair algorithmic decision systems requires balancing model performance with fairness toward affected individuals: more fairness might require sacrificing some performance and vice versa, yet the space of possible trade-offs is still poorly understood. We investigate fairness in binary prediction-based decision problems by conceptualizing decision making as a multi-objective optimization problem that simultaneously considers decision-maker utility and group fairness. We investigate the set of Pareto-optimal decision rules for arbitrary utility functions for the decision maker, arbitrary population distributions, and a wide range of group fairness metrics. We find that the Pareto frontier consists of deterministic, group-specific threshold rules applied to individuals' success probability. This complements existing optimality theorems from the literature which, for specific fairness constraints, posit lower-bound threshold rules only. However, we also show that, depending on the fairness metric used, the Pareto frontier may include upper-bound threshold rules, thus preferring individuals with lower success probabilities. We show that the location of the Pareto frontier depends only on population characteristics, utility functions and fairness score, but not on the technical design of the algorithm: our findings hold for pre-, in-, and post-processing approaches alike. Our results generalize existing optimality theorems for fairness-constrained classification and extend them to generalized fairness metrics and fairness principles, and to partial fairness regimes. This paper connects formal fairness research with legal and ethical requirements to search for less discriminatory alternatives, offering a principled foundation for evaluating and comparing algorithmic decision systems.


Authors (2)

Summary

The paper presents a framework that balances fairness and performance by characterizing the Pareto frontier through group-specific threshold rules.
It reveals that both lower-bound and upper-bound thresholds are crucial for achieving optimal utility-fairness trade-offs across diverse fairness metrics.
Empirical comparisons, including on the Adult Income dataset, demonstrate that implicit group-specific thresholding can effectively approximate theoretical optima even without explicit group features.

Fairness-Performance Trade-offs in Algorithmic Decision Systems: A Characterization of the Pareto Frontier

Introduction

This work addresses the longstanding problem of balancing fairness and performance in binary prediction-based algorithmic decision systems. The authors conceptualize this tension as a multi-objective optimization problem in which one simultaneously maximizes decision-maker utility and enforces group fairness across socially salient groups. The paper generalizes earlier classes of results by providing a rigorous, technology-agnostic characterization of the Pareto frontier with respect to performance and a wide family of fairness metrics, encompassing not only confusion-matrix-based parity constraints but also metrics derived from distributive justice frameworks (e.g., Rawlsian, prioritarian, sufficientarian).

Framework for Utility and Fairness

Central to the analysis is the abstraction of decision making as a mapping from feature vectors $\vec{x}$ to binary decisions $D$, with performance quantified by a decision-maker utility function $U(D, Y)$, and fairness quantified via a generalized fairness score $FS$ built upon decision-subject utility matrices $V(D, Y)$ conditional on protected attributes. The framework allows arbitrary population distributions and enables full generality for group-dependent utility functions and fairness indices, including but not restricted to classical confusion matrix rates (selection rate, TPR, FPR, PPV, etc.).

A major insight is that classical parity notions (e.g., demographic parity, equal opportunity) can be represented as expectation values of $V$, which makes more general, impact-based fairness requirements analytically tractable within the same framework.
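As a hedged illustration of this representation, the two parity notions can be written out for groups $a$ and $b$. The choice $V(D,Y) = D$, the symbol $A$ for the protected attribute, and the absolute-difference form of $FS$ are assumptions made here for concreteness, not definitions quoted from the paper:

$$
\begin{aligned}
\text{Demographic parity:}\quad & E[V(D,Y) \mid A=a] = P[D=1 \mid A=a], \\
\text{Equal opportunity:}\quad & E[V(D,Y) \mid Y=1, A=a] = P[D=1 \mid Y=1, A=a], \\
\text{Fairness score (one option):}\quad & FS = -\bigl|\, E[V \mid \cdot, A=a] - E[V \mid \cdot, A=b] \,\bigr| .
\end{aligned}
$$

With $V(D,Y) = D$, the first expectation is the group selection rate and the second is the group true positive rate. Swapping in a different $V$ changes what is being equalized without changing the form of the expression.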

Characterization of Pareto-Optimality

The authors rigorously prove that for all fairness metrics expressible as group-conditioned expectation functionals of $V$, every Pareto-optimal policy in the utility-fairness space has the structural form of a deterministic group-specific threshold rule applied to the estimated success probability $p(\vec{x}) = P[Y=1|\vec{x}]$. The boundaries of the Pareto frontier are thus constructed from all possible combinations of group-specific lower-bound and upper-bound thresholds on $p(\vec{x})$. This result subsumes prior optimality results, which typically only consider group-specific lower-bound thresholds, and demonstrates that upper-bound thresholding (i.e., cherry-picking or within-group unfairness) can be necessary for Pareto-optimality under certain metrics and settings.
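The structural form described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the names `GroupThreshold` and `decide`, the example thresholds, and the convention that ties at the threshold are selected are all assumptions made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupThreshold:
    """Hypothetical container for one group's deterministic threshold rule."""
    tau: float                # threshold on the success probability p
    lower_bound: bool = True  # True: select p >= tau; False: select p <= tau


def decide(p: float, group: str, rules: dict[str, GroupThreshold]) -> int:
    """Return the binary decision D for an individual with success probability p."""
    rule = rules[group]
    if rule.lower_bound:
        return int(p >= rule.tau)  # conventional rule: favor high p
    return int(p <= rule.tau)      # upper-bound rule: favor low p


# Illustrative values only: group "a" uses a lower bound, group "b" an upper bound.
rules = {
    "a": GroupThreshold(tau=0.6),
    "b": GroupThreshold(tau=0.3, lower_bound=False),
}
decisions = [decide(0.7, "a", rules), decide(0.7, "b", rules), decide(0.2, "b", rules)]
```

Here the same success probability of 0.7 leads to a positive decision in group "a" and a negative one in group "b", while an individual in group "b" with probability 0.2 is selected. That last case is the behavior the text calls cherry-picking.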

Behavior of the Pareto Frontier

Through synthetic and empirical studies, the paper illustrates how the structure of the Pareto frontier is determined strictly by population-level characteristics ($g(p|a)$, the distribution of $p$ within each group), utility matrices, and the fairness score definition, irrespective of the technical realization (pre-, in-, or post-processing). The role of thresholding, in both its direction and its group-dependence, varies fundamentally with the chosen fairness criterion:

For selection-rate-based fairness, the frontier consists entirely of lower-bound thresholds for all groups.
When subject utility matrices capture substantial differential impact or harm across $(D, Y)$ pairs, frontiers requiring upper-bound thresholds for some groups arise, representing scenarios where the highest utility-fairness combinations are achieved by preferring lower-probability individuals in particular groups.

This is visually depicted in:

Figure 1: The Pareto frontier for different settings of DS utility matrices: color encodes the types of group-specific threshold rules comprising the non-dominated frontier.

Theoretical and Practical Implications

Theorem (Threshold Characterization): Every Pareto-optimal policy is a group-dependent deterministic threshold rule on $p(\vec{x})$; each group may require either a lower-bound or an upper-bound threshold.

This establishes a fundamental upper bound for achievable fairness-performance trade-offs in binary classification settings with group fairness constraints. The result is implementation-agnostic: any system architecture capable of representing group-dependent thresholds can, in principle, reach points on the frontier.
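Because the theorem restricts the search to threshold rules, a frontier can in principle be traced by brute force: enumerate threshold combinations, score each one, and keep the non-dominated points. The sketch below does this on a toy population. Everything in it is an assumption chosen for illustration: the two-group distribution `G`, the utility matrix `U`, the threshold grid, and the selection-rate-gap fairness score.

```python
from itertools import product

# Hypothetical discretized g(p|a): success probability -> share of the group.
G = {
    "a": {0.3: 0.5, 0.9: 0.5},
    "b": {0.2: 0.5, 0.4: 0.5},
}
GROUP_SHARE = {"a": 0.5, "b": 0.5}                          # population share per group
U = {(1, 1): 1.0, (1, 0): -1.0, (0, 1): 0.0, (0, 0): 0.0}   # U[(d, y)], assumed values


def selects(p, tau, lower):
    """Lower-bound rule selects p >= tau; upper-bound rule selects p <= tau."""
    return p >= tau if lower else p <= tau


def evaluate(rule):
    """rule maps group -> (tau, lower). Returns (DM utility, fairness score)."""
    utility, rate = 0.0, {}
    for a, dist in G.items():
        tau, lower = rule[a]
        rate[a] = 0.0
        for p, mass in dist.items():
            d = int(selects(p, tau, lower))
            # Expected decision-maker utility of decision d at success probability p.
            utility += GROUP_SHARE[a] * mass * (p * U[(d, 1)] + (1 - p) * U[(d, 0)])
            rate[a] += mass * d
    fairness = 0.0 - abs(rate["a"] - rate["b"])  # 0 is perfectly fair, lower is worse
    return round(utility, 9), round(fairness, 9)


def pareto_front(points):
    """Keep the points that no other point weakly dominates (higher is better)."""
    return sorted({
        pt for pt in points
        if not any(q[0] >= pt[0] and q[1] >= pt[1] and q != pt for q in points)
    })


taus = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.1]
candidates = [(tau, lower) for tau in taus for lower in (True, False)]
points = [evaluate({"a": ra, "b": rb}) for ra, rb in product(candidates, repeat=2)]
front = pareto_front(points)
print(front)  # [(0.15, 0.0), (0.2, -0.5)]
```

In this toy setting the frontier has two points: the utility-maximizing rule that selects only the high-probability individuals of group "a", and a fully fair rule that additionally selects the upper half of group "b" at a utility cost of 0.05. Both are reachable with lower-bound thresholds, which is in line with the selection-rate case described in the text, though a single toy example is of course no substitute for the theorem.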

Further, the frontier may include thresholding strategies that traditional approaches—restricted to lower-bound rules—fail to realize. The occurrence of upper-bound thresholding is rooted in the shape of the DS utility matrix and the chosen fairness principle; specifically, when optimizing for certain metrics, optimality is achieved by allocating positive decisions against conventional ML intuition (i.e., favoring those less likely to succeed).
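One heuristic way to see where the direction of a threshold comes from is the following scalarization sketch. It is not claimed to be the paper's proof; the weight $\lambda_a$ and the shorthand $\Delta u$, $\Delta v_a$ are notation introduced here, with $V_a$ denoting the decision-subject utility matrix for group $a$. For an individual with success probability $p$, the expected gain from a positive rather than a negative decision is

$$
\begin{aligned}
\Delta u(p) &= p\,[U(1,1) - U(0,1)] + (1-p)\,[U(1,0) - U(0,0)], \\
\Delta v_a(p) &= p\,[V_a(1,1) - V_a(0,1)] + (1-p)\,[V_a(1,0) - V_a(0,0)].
\end{aligned}
$$

Both expressions are affine in $p$, so a rule that selects whenever $\Delta u(p) + \lambda_a\,\Delta v_a(p) \ge 0$ selects either $\{p \ge \tau_a\}$ or $\{p \le \tau_a\}$, depending on the sign of the combined slope. If the decision-subject term pulls strongly enough against the decision maker's slope (and $\lambda_a$ may be negative for a group whose expected utility has to be reduced to close a gap), the combined slope turns negative and the rule becomes an upper-bound threshold.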

Comparative Study: Empirical Frontier Construction and Algorithmic Implications

An empirical comparison is made between the constructive threshold-combination method derived from the theory and the PF-SMG algorithm, a stochastic multi-objective in-processing approach, on the Adult Income dataset. Using a logistic regression model trained without access to sensitive attributes, the authors show that their approach strictly dominates PF-SMG in the accuracy-fairness space, and that individually trained in-processing methods tend to approximate group-specific threshold rules even without explicit access to group membership.

Figure 2: Comparison of Pareto frontiers on the Adult Income dataset between the PF-SMG algorithm and the constructive threshold-combination approach.

A key insight is that group-specific thresholding can emerge implicitly, even in models trained without explicit group features, through correlations in non-sensitive features, supporting the theoretical claim that access to $A$ is sufficient but not necessary for realizing these rules.

Population Distributions and Generalization

The theoretical framework leverages the population distribution over $p$ within each group ($g(p|a)$), showing that the precise shape of the Pareto frontier can be determined with high accuracy solely from these distributions, without requiring a Bayes-optimal predictor $p(\vec{x})$ itself. This reduces the complexity of frontier estimation and paves the way for new ML strategies that focus on modeling $g(p|a)$ rather than direct outcome prediction.
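As a rough sketch of what modeling these distributions could look like in practice, the snippet below bins estimated success probabilities into a per-group histogram. The function name `estimate_g`, the equal-width binning, and the use of bin midpoints are assumptions made for illustration; the paper's own estimation procedure is not described in this summary.

```python
from collections import Counter, defaultdict


def estimate_g(scores, groups, n_bins=10):
    """Approximate g(p|a) as {group: {bin_midpoint: share}} for scores in [0, 1]."""
    counts = defaultdict(Counter)
    for p, a in zip(scores, groups):
        b = min(int(p * n_bins), n_bins - 1)  # a score of exactly 1.0 goes in the last bin
        counts[a][b] += 1
    return {
        a: {(b + 0.5) / n_bins: c / sum(cnt.values()) for b, c in sorted(cnt.items())}
        for a, cnt in counts.items()
    }


# Illustrative scores only; in practice these would come from a calibrated model.
g_hat = estimate_g([0.05, 0.15, 0.95, 1.0], ["a", "a", "b", "b"])
```

The resulting dictionary of histograms is one possible input for evaluating threshold rules at the population level, since the expected utility and group rates of any threshold rule depend only on how much probability mass each group has on either side of its threshold.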

Figure 3: Distributions $g(p|a)$ of $p$ for the two protected groups, showcasing population heterogeneity as the sole technical determinant of achievable trade-offs.

Discussion and Future Directions

The work has significant implications for the development, assessment, and auditability of algorithmic decision systems:

Benchmarking: The structural characterization of the Pareto frontier establishes a universal benchmark for evaluating proposed fairness-aware decision algorithms. Any system operating far from the theoretical frontier is demonstrably suboptimal in the utility-fairness space.
Auditability and Liability: This benchmark can underpin legal doctrines (e.g., the Less Discriminatory Alternative standard), quantifying whether a given system is unnecessarily discriminatory given the population and utility structure.
Algorithm Design: ML pipelines seeking optimal fairness-performance trade-offs should focus on learning representations and model classes that flexibly implement group-dependent thresholding, including upper-bound rules where justified by utility/fairness definitions.
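The benchmarking and auditing uses listed above can be made concrete with a small helper that measures how much decision-maker utility a deployed system leaves on the table at its own fairness level. This is a hypothetical sketch: the function `utility_gap`, the `(utility, fairness)` tuple convention with higher values being better on both axes, and the example frontier are assumptions, and a legal assessment would involve far more than this number.

```python
def utility_gap(point, frontier):
    """Utility shortfall of `point` against frontier points that are at least as fair.

    Returns None when no frontier point reaches the system's fairness level.
    """
    utility, fairness = point
    reachable = [fu for fu, ff in frontier if ff >= fairness]
    if not reachable:
        return None
    return max(0.0, max(reachable) - utility)


# Illustrative frontier of (utility, fairness) pairs, not taken from the paper.
frontier = [(0.15, 0.0), (0.2, -0.5)]
gap = utility_gap((0.1, -0.5), frontier)
```

A strictly positive gap means some frontier policy is at least as fair and yields more utility, so the audited system is dominated. A gap of zero means that no improvement in utility is available without giving up fairness.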

The paper also raises normative questions about the acceptability of group-specific (particularly upper-bound) rules, given their potential to instantiate within-group unfairness ("cherry-picking"). These results challenge the alignment of statistical fairness objectives with social/legal expectations and suggest a need for careful scrutiny of resulting policies.

A limitation is that the theory is developed for scalar fairness metrics; multi-metric constraints (e.g., Equalized Odds) are not covered, but the extension is a clear avenue for subsequent research.

Conclusion

This paper rigorously characterizes the theoretical Pareto frontier for fairness-utility trade-offs in binary algorithmic decision systems, unifying and extending previous optimality results to general fairness metrics and utility formulations. The finding that both lower- and upper-bound, group-specific threshold policies can constitute the efficient frontier has immediate implications for algorithm design, auditing, and legal compliance. By establishing a universal, population-based, and technology-agnostic benchmark, the work reorients both academic research and practice toward more principled evaluation, transparent reporting, and potentially deeper engagement with ethical questions surrounding fairness in automated decision making (2605.10604).
