Papers
Topics
Authors
Recent
2000 character limit reached

Simplicity Bias Leads to Amplified Performance Disparities

Published 13 Dec 2022 in cs.LG and cs.CY | (2212.06641v2)

Abstract: Which parts of a dataset will a given model find difficult? Recent work has shown that SGD-trained models have a bias towards simplicity, leading them to prioritize learning a majority class, or to rely upon harmful spurious correlations. Here, we show that the preference for "easy" runs far deeper: A model may prioritize any class or group of the dataset that it finds simple-at the expense of what it finds complex-as measured by performance difference on the test set. When subsets with different levels of complexity align with demographic groups, we term this difficulty disparity, a phenomenon that occurs even with balanced datasets that lack group/label associations. We show how difficulty disparity is a model-dependent quantity, and is further amplified in commonly-used models as selected by typical average performance scores. We quantify an amplification factor across a range of settings in order to compare disparity of different models on a fixed dataset. Finally, we present two real-world examples of difficulty amplification in action, resulting in worse-than-expected performance disparities between groups even when using a balanced dataset. The existence of such disparities in balanced datasets demonstrates that merely balancing sample sizes of groups is not sufficient to ensure unbiased performance. We hope this work presents a step towards measurable understanding of the role of model bias as it interacts with the structure of data, and call for additional model-dependent mitigation methods to be deployed alongside dataset audits.

Citations (10)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (2)

Collections

Sign up for free to add this paper to one or more collections.