Measurement and Fairness (1912.05511v3)

Published 11 Dec 2019 in cs.CY and cs.LG

Abstract: We propose measurement modeling from the quantitative social sciences as a framework for understanding fairness in computational systems. Computational systems often involve unobservable theoretical constructs, such as socioeconomic status, teacher effectiveness, and risk of recidivism. Such constructs cannot be measured directly and must instead be inferred from measurements of observable properties (and other unobservable theoretical constructs) thought to be related to them -- i.e., operationalized via a measurement model. This process, which necessarily involves making assumptions, introduces the potential for mismatches between the theoretical understanding of the construct purported to be measured and its operationalization. We argue that many of the harms discussed in the literature on fairness in computational systems are direct results of such mismatches. We show how some of these harms could have been anticipated and, in some cases, mitigated if viewed through the lens of measurement modeling. To do this, we contribute fairness-oriented conceptualizations of construct reliability and construct validity that unite traditions from political science, education, and psychology and provide a set of tools for making explicit and testing assumptions about constructs and their operationalizations. We then turn to fairness itself, an essentially contested construct that has different theoretical understandings in different contexts. We argue that this contestedness underlies recent debates about fairness definitions: although these debates appear to be about different operationalizations, they are, in fact, debates about different theoretical understandings of fairness. We show how measurement modeling can provide a framework for getting to the core of these debates.

Authors (2)
  1. Abigail Z. Jacobs (21 papers)
  2. Hanna Wallach (48 papers)
Citations (353)

Summary

  • The paper offers a novel measurement modeling framework that reveals mismatches between theoretical fairness constructs and their operationalization in computational systems.
  • It introduces fairness-oriented conceptualizations of construct reliability and construct validity, emphasizing consistency and accuracy in models applied to domains such as teacher effectiveness.
  • The research advocates for explicit assumption testing in computational design, urging developers to preemptively identify and mitigate fairness-related harms.

Analyzing Fairness in Computational Systems: Insights from Measurement Modeling

The paper "Measurement and Fairness" by Abigail Z. Jacobs and Hanna Wallach offers a robust examination of fairness within computational systems using the lens of measurement modeling derived from the quantitative social sciences. The authors assert that numerous fairness-related harms in these systems stem from mismatches between theoretical constructs and their operationalization due to assumptions made in the measurement modeling process.

Measurement modeling involves using statistical models to relate unobservable theoretical constructs, such as socioeconomic status or teacher effectiveness, to observable properties deemed related. This process, although commonplace in the social sciences, is often overlooked in computer science when scrutinizing the fairness of these systems.
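
To make the idea concrete, here is a minimal sketch of one possible measurement model, operationalizing an unobservable construct (socioeconomic status) as a combination of observable proxies. The proxies, weights, and data are hypothetical illustrations of the modeling pattern, not the paper's own model.

```python
# A toy measurement model: infer an unobservable construct (SES) from
# observable proxies. Every step below encodes assumptions -- which
# proxies to include, how to scale them, how to weight them -- and each
# is a place where the operationalization can diverge from the
# theoretical construct it is meant to capture.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observable properties for five people.
income = rng.normal(50_000, 15_000, size=5)        # annual income (USD)
years_education = rng.normal(14, 2, size=5)        # years of schooling
neighborhood_index = rng.normal(0.5, 0.1, size=5)  # composite area measure

def zscore(x):
    """Standardize a proxy so it can be combined with the others."""
    return (x - x.mean()) / x.std()

# One possible operationalization: equal-weighted average of z-scored proxies.
ses_score = (zscore(income) + zscore(years_education) + zscore(neighborhood_index)) / 3
print(ses_score)
```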

The authors present a compelling argument that many harms often discussed in the literature could have been anticipated and mitigated if viewed through a measurement modeling perspective. Through this lens, they contribute fairness-oriented conceptualizations of construct reliability and construct validity. These provide tools for making assumptions explicit and for testing them, which is crucial for understanding potential unfairness embedded within computational models.

Construct Reliability and Validity

Construct reliability is akin to measurement precision: it concerns whether measuring the same construct under similar conditions yields consistent results. The authors emphasize test-retest reliability, which asks whether a system produces similar results when the same measurement is repeated. For instance, many value-added models of teacher effectiveness score the same teacher very differently from one year to the next, indicating a lack of test-retest reliability.
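
As a minimal sketch of what such a check might look like, the snippet below computes test-retest reliability as the correlation between two years of synthetic value-added scores for the same teachers; the data and noise levels are assumptions for illustration only.

```python
# Test-retest reliability on synthetic teacher value-added scores.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scores for 100 teachers measured in two consecutive years.
# A reliable measure should rank the same teachers similarly both years.
true_effectiveness = rng.normal(0, 1, size=100)
year1 = true_effectiveness + rng.normal(0, 1.5, size=100)  # noisy measurement
year2 = true_effectiveness + rng.normal(0, 1.5, size=100)

# Correlation between repeated measurements: values near zero indicate
# the scores are dominated by noise rather than the underlying construct.
r = np.corrcoef(year1, year2)[0, 1]
print(f"test-retest correlation: {r:.2f}")
```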

Construct validity, likened to measurement accuracy, concerns whether a model measures what it purports to measure. It has several facets: content validity (whether the operationalization captures the construct comprehensively), convergent validity (whether the measurements align with other accepted measurements of the same construct), discriminant validity (whether the measure avoids capturing unintended constructs), predictive validity (whether the measurements predict related observable outcomes), hypothesis validity (whether the measurements support substantively meaningful hypotheses about the construct), and consequential validity (whether the societal impacts of using the measurements have been considered).
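
Two of these facets lend themselves to simple quantitative checks. The sketch below, on synthetic data with hypothetical measures, illustrates convergent validity (correlation with an established measure of the same construct should be high) and discriminant validity (correlation with a measure of an unrelated construct should be near zero).

```python
# Convergent and discriminant validity checks on synthetic measurements.
import numpy as np

rng = np.random.default_rng(2)
latent = rng.normal(size=200)                     # construct both measures target
m_new = latent + rng.normal(0, 0.5, size=200)     # the measure being validated
m_established = latent + rng.normal(0, 0.5, size=200)  # accepted measure, same construct
m_unrelated = rng.normal(size=200)                # measure of a different construct

convergent = np.corrcoef(m_new, m_established)[0, 1]  # should be high
discriminant = np.corrcoef(m_new, m_unrelated)[0, 1]  # should be near zero
print(f"convergent: {convergent:.2f}, discriminant: {discriminant:.2f}")
```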

The authors critique various models, such as those used to estimate socioeconomic status or judge teacher effectiveness, for their oversights in these areas of validity. Notably, they argue that measurement processes in these systems often inadequately capture the construct of interest due to unacknowledged assumptions or omitted aspects, leading to a cascade of potential harms when applied in real-world scenarios.

Fairness as an Essentially Contested Construct

Fairness itself is posited as an unobservable theoretical construct, one whose theoretical understanding is contested across contexts. The authors argue that many debates about fairness definitions, whether individual fairness (similar individuals should be treated similarly) or group fairness (salient groups should, in aggregate, receive similar treatment or outcomes), are not debates about operationalizations alone but also about underlying values and theoretical understandings of fairness.
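
The two families of definitions check different things, as the sketch below illustrates on synthetic scores; the groups, features, and thresholds are hypothetical, and neither check is "the" definition of fairness, since each encodes a different theoretical understanding.

```python
# Contrasting group fairness and individual fairness checks on toy data.
import numpy as np

rng = np.random.default_rng(3)
group = rng.integers(0, 2, size=1000)              # two demographic groups
features = rng.uniform(size=1000)                  # hypothetical 1-D features
score = 0.8 * features + 0.2 * rng.uniform(size=1000)
decision = score > 0.5

# Group fairness (demographic parity flavor): similar positive-decision rates.
rate_a = decision[group == 0].mean()
rate_b = decision[group == 1].mean()
print(f"positive-decision rates: {rate_a:.2f} vs {rate_b:.2f}")

# Individual fairness: similar individuals should receive similar scores.
# Largest score gap among pairs whose features differ by less than eps.
eps = 0.01
close = np.abs(features[:, None] - features[None, :]) < eps
gaps = np.abs(score[:, None] - score[None, :])
print(f"max score gap among similar pairs: {gaps[close].max():.2f}")
```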

The contested nature of fairness complicates its operationalization in computational systems. The paper discusses the much-debated COMPAS risk assessment tool, illustrating how arguments over error-rate balance versus predictive parity are essentially arguments between different theoretical understandings of fairness. The authors propose that measurement modeling can clarify such debates by distinguishing disagreements about operationalizations from deeper disagreements about the theoretical understanding of fairness itself.
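
To make the tension concrete, here is a minimal sketch on synthetic data (not the COMPAS data; the groups, base rates, and predictor are assumptions) computing the two competing criteria: error-rate balance (equal false positive rates across groups) and predictive parity (equal positive predictive values across groups). When base rates differ, an imperfect predictor cannot satisfy both at once (Chouldechova, 2017), so the disagreement reflects a clash of theoretical understandings rather than a flaw in either metric.

```python
# Error-rate balance vs. predictive parity on synthetic predictions.
import numpy as np

def fpr_by_group(y_true, y_pred, group):
    """False positive rate per group; equal FPRs is one fairness criterion."""
    return {g: y_pred[(group == g) & (y_true == 0)].mean() for g in np.unique(group)}

def ppv_by_group(y_true, y_pred, group):
    """Positive predictive value per group; equal PPVs is the rival criterion."""
    return {g: y_true[(group == g) & (y_pred == 1)].mean() for g in np.unique(group)}

rng = np.random.default_rng(4)
group = rng.integers(0, 2, size=2000)
y_true = rng.binomial(1, np.where(group == 0, 0.3, 0.5))  # differing base rates
y_pred = rng.binomial(1, 0.4, size=2000)                  # group-blind predictor

# This predictor roughly equalizes FPRs across groups, yet its PPVs differ
# because the groups' base rates differ -- the heart of the COMPAS debate.
print("FPR by group:", fpr_by_group(y_true, y_pred, group))
print("PPV by group:", ppv_by_group(y_true, y_pred, group))
```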

Implications for Practice and Future Developments

This paper challenges computational practitioners to incorporate measurement modeling into system design, emphasizing transparency and accountability. By surfacing assumptions that would otherwise remain implicit, measurement modeling provides a clearer lens through which to anticipate potential fairness-related harms.

Future AI developments should prioritize explicit and rigorous tests of reliability and validity for theoretical constructs and their applications. By doing so, developers can anticipate systemic bias implications and address fairness-related issues before system deployment. Moreover, existing models can be re-evaluated using this framework to align more closely with ethical and fairness goals.

Ultimately, while the challenge of defining and measuring theoretical constructs like fairness remains, employing the methodologies from measurement modeling ensures a more robust, careful approach to computational system design. This research underscores the essential role of interdisciplinary knowledge bridging social science principles and computer science practice to create equitable computational systems.
