- The paper introduces a filtering process that flagged 1,593 unsafe (offensive or otherwise sensitive) synsets in the ImageNet person subtree for removal, as a step toward mitigating bias.
- It employs a novel imageability annotation procedure that identified only 158 of the remaining safe synsets as visually representable.
- The study proposes a crowdsourced re-balancing strategy using demographic data to address underrepresentation and enhance dataset fairness.
An Expert Analysis of "Towards Fairer Datasets: Filtering and Balancing the Distribution of the People Subtree in the ImageNet Hierarchy"
The composition of large-scale datasets has profound implications for the performance and fairness of computer vision models. The paper, "Towards Fairer Datasets: Filtering and Balancing the Distribution of the People Subtree in the ImageNet Hierarchy," presents a meticulous examination of the biases within ImageNet's person subtree and proposes concrete methods to mitigate them. The research addresses pivotal inequities that, left unaddressed, would propagate into machine learning systems trained on the data.
Core Issues Identified
The paper outlines three primary issues within ImageNet's person subtree:
- Stagnant Concept Vocabulary: The use of the WordNet hierarchy as the backbone for ImageNet has led to the inclusion of outdated and potentially offensive synsets. WordNet's largely static lexical ontology has not kept pace with contemporary societal norms, and as a result a significant portion of the synsets under the person subtree was identified as offensive or otherwise sensitive.
- Non-Visual Concepts: Not all synsets in WordNet denote concepts that can be reliably characterized from images alone. Attempting to provide exhaustive visual examples for such categories yields images that misrepresent the underlying concept and can reinforce cultural stereotypes.
- Lack of Image Diversity: Certain demographic groups are underrepresented in the dataset, a pattern tied to biased image search engine results. When these imbalances propagate into model training data, they manifest as unequal predictive performance across groups.
Methodological Approach and Key Findings
The researchers carried out a rigorous filtering process, soliciting annotations from a diverse pool of annotators to identify potentially offensive or otherwise sensitive synsets. Through this effort, they flagged 1,593 of the 2,832 synsets in the person subtree as unsafe and slated them for removal. They also stressed that removing offensive synsets alone, without addressing representational imbalances, would be insufficient.
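The paper reports the outcome of this annotation effort rather than a reference implementation, but the filtering step can be pictured as a simple vote-aggregation rule over crowd labels. The sketch below is a minimal illustration under that assumption; the function name, label vocabulary, and thresholds are placeholders for exposition, not the authors' actual criteria.

```python
from collections import Counter

def filter_unsafe_synsets(annotations, min_votes=3, unsafe_ratio=0.5):
    """Flag a synset as unsafe when enough annotators mark it offensive or sensitive.

    `annotations` maps a synset ID to the list of labels ("offensive",
    "sensitive", "safe") collected from annotators. The vote count and
    ratio thresholds are illustrative, not the paper's published values.
    """
    unsafe = set()
    for synset_id, labels in annotations.items():
        counts = Counter(labels)
        flagged = counts["offensive"] + counts["sensitive"]
        if len(labels) >= min_votes and flagged / len(labels) > unsafe_ratio:
            unsafe.add(synset_id)
    return unsafe

# Toy usage: the second (hypothetical) synset is flagged for removal.
votes = {
    "plumber": ["safe", "safe", "safe"],
    "slur_term": ["offensive", "offensive", "safe"],
}
print(filter_unsafe_synsets(votes))  # {'slur_term'}
```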
To address non-visual concepts, the researchers employed a novel imageability annotation process based on crowd-sourced ratings, which deemed only 158 of the remaining safe synsets imageable. This step underscored the importance of ensuring that annotated categories correspond to concepts that are genuinely recognizable from images.
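The imageability step can similarly be viewed as averaging crowd ratings per synset and keeping only those above a cutoff. The following sketch assumes a 1-5 rating scale; the 4.0 cutoff and toy scores are illustrative assumptions rather than the paper's published parameters.

```python
import statistics

def imageable_synsets(ratings, threshold=4.0):
    """Return synsets whose mean imageability rating meets the threshold.

    `ratings` maps a synset ID to a list of 1-5 crowd scores answering,
    roughly, "how easy is it to form a mental image of this concept?".
    The threshold here is a placeholder, not the published criterion.
    """
    means = {synset: statistics.mean(scores) for synset, scores in ratings.items()}
    return {synset: m for synset, m in means.items() if m >= threshold}

# Toy usage: "plumber" is easy to picture, "debtor" much less so.
scores = {"plumber": [5, 5, 4, 5], "debtor": [2, 1, 3, 2]}
print(imageable_synsets(scores))  # {'plumber': 4.75}
```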
Regarding demographic representation, the authors ran a large-scale crowdsourcing effort to manually annotate gender, skin color, and age for images in the remaining imageable synsets. These demographic annotations make it possible to re-balance the dataset, letting researchers adjust the image distribution within each category toward a desired demographic target. Citing privacy and ethical considerations, they chose not to release the image-level demographic annotations.
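Since the image-level annotations are not released, any concrete re-balancing code is necessarily hypothetical. The snippet below only illustrates the general idea of subsampling a category so that group proportions match a target distribution; the attribute buckets, function name, and sampling rule are assumptions made for illustration, not the paper's released interface.

```python
import random
from collections import defaultdict

def rebalance(images, target_fractions, seed=0):
    """Subsample one synset's images so group shares match a target distribution.

    `images` is a list of (image_id, group) pairs, where `group` is some
    demographic bucket; `target_fractions` maps each group to its desired
    share (summing to 1). The size of the balanced subset is limited by
    the scarcest group relative to its target share.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for image_id, group in images:
        by_group[group].append(image_id)

    # Largest total set that can still satisfy the target proportions.
    max_total = min(
        len(by_group[group]) / frac
        for group, frac in target_fractions.items()
        if frac > 0
    )

    balanced = []
    for group, frac in target_fractions.items():
        balanced.extend(rng.sample(by_group[group], int(max_total * frac)))
    return balanced

# Toy usage: aim for an even split across two illustrative buckets.
imgs = [("a", "group_1"), ("b", "group_1"), ("c", "group_1"), ("d", "group_2")]
print(rebalance(imgs, {"group_1": 0.5, "group_2": 0.5}))  # one image per bucket
```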
Implications and Future Directions
The implications of these findings extend into both practical and theoretical domains of AI. Practically, this work underscores the urgent need to audit and revamp existing large-scale datasets so that models do not learn and perpetuate bias. Theoretically, it advances the discourse on fairness and lays a foundation for more equitable algorithmic development. The paper also sets a precedent for future research, suggesting iterative community involvement in refining datasets, including mechanisms such as user feedback for reporting unsafe concepts.
The research emphasizes that any fairness intervention requires both well-curated datasets and conscientious algorithmic design to produce genuinely less biased machine learning systems. Future directions could involve extending these methodologies to the broader ImageNet dataset or to similar large-scale datasets, supporting fairer and more ethical AI applications.
In conclusion, this work provides valuable insight into the structural adjustments necessary for fair representation in AI datasets, a prerequisite for systems that are equitable across demographic groups. It invites reflection and further innovation within the research community to uphold the commitment to fairness and transparency in artificial intelligence.