Critical Survey of Bias in Natural Language Processing: An Expert Overview
The paper "Language (Technology) is Power: A Critical Survey of 'Bias' in NLP" by Blodgett, Barocas, Daumé III, and Wallach critically examines how the NLP literature analyzes bias in NLP systems. Based on a survey of 146 papers, the analysis highlights pervasive inconsistencies and a lack of clear motivations across the literature and proposes recommendations for future research. This essay provides an expert synopsis of the paper, focusing on its methodological rigor, key findings, and the implications for future developments in NLP research.
Methodological Approach
The authors conducted a comprehensive review of the literature, restricting their focus to papers analyzing written text in order to keep the analysis uniform. They combined systematic keyword searches and citation graph traversals within the ACL Anthology with manual inspection of papers from relevant conferences and workshops. The selected papers were then categorized by their motivations and by their techniques for quantifying or mitigating bias, using a taxonomy of harms that distinguishes allocational harms from representational harms.
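To make the selection procedure concrete, the following is a minimal, illustrative Python sketch of a keyword-plus-citation selection pass. The `papers` records, the keyword list, and the toy citation links are hypothetical stand-ins; this is not the authors' actual pipeline or data, only a sketch of the general idea of combining keyword search with citation graph traversal.

```python
from collections import deque

# Hypothetical corpus records: each paper has an id, a title/abstract string,
# and a list of papers it cites. These structures are illustrative only.
papers = {
    "P1": {"text": "Mitigating gender bias in word embeddings", "cites": ["P2"]},
    "P2": {"text": "Neural machine translation with attention", "cites": []},
    "P3": {"text": "Measuring stereotypes in coreference resolution", "cites": ["P1"]},
}

KEYWORDS = ("bias", "stereotype", "fairness")  # illustrative search terms

def keyword_matches(text):
    """Return True if the paper text mentions any of the search keywords."""
    lowered = text.lower()
    return any(term in lowered for term in KEYWORDS)

# Step 1: seed set from a keyword search over titles/abstracts.
seed = {pid for pid, rec in papers.items() if keyword_matches(rec["text"])}

# Step 2: breadth-first traversal of the citation graph from the seed set,
# following citation edges to pick up related keyword-matching papers.
selected = set(seed)
queue = deque(seed)
while queue:
    pid = queue.popleft()
    for cited in papers[pid]["cites"]:
        if cited not in selected and keyword_matches(papers[cited]["text"]):
            selected.add(cited)
            queue.append(cited)

print(sorted(selected))  # ['P1', 'P3'] under this toy corpus
```

A purely automated pass like this would miss work published outside the ACL Anthology, which is why the survey also relied on manual inspection of relevant conferences and workshops.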
Key Findings
Motivations Behind Bias Analysis
The authors make several critical observations about the motivations driving research on bias in NLP:
- Diverse, Multiple, and Vague Motivations: Many papers stated broad or ambiguous reasons for addressing bias, often without clear normative grounding. Only a subset of papers articulated explicit normative motivations, such as avoiding discrimination or enhancing model fairness.
- Lack of Normative Reasoning: Numerous papers failed to justify why the identified biases were harmful or to whom they caused harm, diluting the overarching goal of reducing socially detrimental effects in NLP systems.
- Inconsistent Conceptualization of Bias: Even among papers targeting the same NLP tasks, there were stark differences in how bias was conceptualized and measured, reflecting a lack of consensus in the field.
- Conflation of Harm Types: Papers frequently blurred the lines between allocational and representational harms, complicating efforts to address them effectively.
Techniques for Addressing Bias
The techniques proposed in the surveyed papers were often poorly aligned with the stated motivations and were generally not informed by relevant non-NLP literature:
- Inadequate Grounding in External Literature: Most techniques failed to leverage insights from fields such as sociolinguistics or social psychology, limiting their efficacy.
- Narrow Focus: Techniques predominantly concentrated on system predictions and dataset properties, with minimal attention given to the broader development lifecycle, including task definitions, annotation guidelines, and evaluation metrics (a minimal example of a prediction-focused measurement appears after this list).
- Misaligned Techniques: While a significant portion of papers cited allocational harms in their motivations, only a handful proposed techniques that actually address them, underscoring a disconnect between problem identification and solution implementation.
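As a concrete illustration of the prediction-focused measurements that dominate the surveyed techniques, the sketch below computes a per-group false positive rate gap for a hypothetical toxicity classifier. The group labels, gold labels, and predictions are invented for illustration; the point is that a metric of this kind inspects only system outputs and says nothing about task definitions, annotation guidelines, or evaluation choices.

```python
from collections import defaultdict

# Hypothetical evaluation records for a toxicity classifier: each example
# carries a demographic group marker, the gold label, and the system's
# prediction (1 = toxic, 0 = non-toxic). All values are illustrative.
records = [
    {"group": "A", "gold": 0, "pred": 1},
    {"group": "A", "gold": 0, "pred": 0},
    {"group": "A", "gold": 1, "pred": 1},
    {"group": "B", "gold": 0, "pred": 0},
    {"group": "B", "gold": 0, "pred": 0},
    {"group": "B", "gold": 1, "pred": 1},
]

def false_positive_rate_by_group(rows):
    """Return {group: FP / (FP + TN)} computed over non-toxic (gold == 0) examples."""
    fp = defaultdict(int)
    negatives = defaultdict(int)
    for row in rows:
        if row["gold"] == 0:
            negatives[row["group"]] += 1
            if row["pred"] == 1:
                fp[row["group"]] += 1
    return {g: fp[g] / negatives[g] for g in negatives}

rates = false_positive_rate_by_group(records)
gap = max(rates.values()) - min(rates.values())
print(rates, gap)  # {'A': 0.5, 'B': 0.0} and a gap of 0.5 for this toy data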
Recommendations for Future Research
To address the identified gaps, the authors propose three primary recommendations:
- Grounding in Literature on Language and Social Hierarchies: Future research should draw on non-NLP literature to understand how language and social structures co-produce systemic bias. This step is crucial for recognizing representational harms as significant in their own right and not merely as precursors to allocational harms.
- Explicit Conceptualizations of Bias: Researchers should provide clear statements on why certain biases are harmful, in what ways, and to whom. This clarity is essential for aligning normative reasoning with technical approaches.
- Engagement with Affected Communities: The lived experiences of communities impacted by biased NLP systems should be central to bias analysis. Furthermore, the power dynamics between technologists and these communities should be scrutinized and reimagined.
Implications and Future Directions
The findings and recommendations of this paper have far-reaching implications:
- Practical Impact: By grounding bias mitigation techniques in a deeper understanding of language and social hierarchies, NLP systems can be designed to be more equitable and just. This approach can help avoid perpetuating existing social biases through technology.
- Theoretical Contributions: The call for explicit normative reasoning and engagement with affected communities invites a more interdisciplinary approach to NLP research, which could lead to richer, more holistic understandings of bias.
Conclusion
This paper makes a substantial contribution to the field of NLP by critically examining existing research on bias and proposing a path forward grounded in interdisciplinary insights and community engagement. By addressing the inconsistencies and gaps in current approaches, these recommendations aim to pave the way for more robust, equitable, and socially aware NLP systems. Future research should adopt these guidelines to better understand and mitigate the complex interplay between language technology and social power dynamics.