A Community-Informed Approach to Interventions for Misgendering
Introduction
The paper "A Community-Informed Approach to Interventions for Misgendering" presents a novel dataset and framework for addressing the pervasive issue of misgendering in automated systems. The work is motivated by the scarcity of scholarship and tooling aimed at combating the harms of misgendering, especially for gender-diverse individuals. Drawing on insights from a community survey, the authors define a misgendering intervention task and introduce the MisgenderMender dataset to support ongoing research in this area.
Data Collection and Annotation
The MisgenderMender dataset comprises social media content sourced from X (formerly Twitter) and YouTube, together with content generated by several LLMs. The texts concern 30 non-cisgender public figures whose gender identities and preferred terms are publicly available. Each instance underwent careful human annotation to determine the presence of misgendering, yielding a dataset of 3790 instances.
To support effective annotation, annotators were provided with each individual's gender linguistic profile: their preferred pronouns, gender terms, and deadname (if applicable). Two sub-tasks were defined: (i) detecting misgendering in text, and (ii) correcting detected misgendering, with the latter applied only to LLM-generated content, where editing was considered appropriate.
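To make the annotation setup concrete, the sketch below models one dataset instance and its accompanying gender linguistic profile as Python data structures. The field names and domain labels are illustrative assumptions, not the dataset's published schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GenderLinguisticProfile:
    """Publicly disclosed linguistic preferences of one public figure.
    Field names here are illustrative, not the dataset's actual schema."""
    name: str
    preferred_pronouns: list[str]          # e.g. ["they", "them", "their"]
    gender_terms: list[str]                # e.g. ["non-binary person"]
    deadname: Optional[str] = None         # None if not applicable

@dataclass
class MisgenderingInstance:
    """One annotated text from X, YouTube, or an LLM generation."""
    text: str
    domain: str                            # e.g. "x_post", "youtube_comment", "llm_generation"
    profile: GenderLinguisticProfile
    misgendering_present: bool             # sub-task (i): detection label
    corrected_text: Optional[str] = None   # sub-task (ii): LLM content only
```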
Community Survey
A significant aspect of this research is the community survey conducted among gender-diverse individuals in the US. This survey aimed to understand the lived experiences of misgendering and gather opinions on automated intervention systems. Key findings include:
- Prevalence: Misgendering is most commonly experienced on social media, followed by AI-generated content, news articles, and academic journals.
- Intervention Preferences: Respondents strongly favored automatic detection of misgendering, but opinions on automatic correction versus content hiding varied by domain. Automatic correction was more acceptable for AI-generated content than for social media, where freedom of expression and the potential for false allyship were major concerns.
- Concerns: Fundamental feasibility issues, privacy, risk of profiling, and limitations of NLP systems in handling nuanced language were significant concerns among the participants.
Task Definition and Evaluation
The dataset enabled the evaluation of existing NLP systems on the task of detecting misgendering. Initial benchmarks were established using several methods:
- Prompting LLMs: Few-shot chain-of-thought prompting of GPT-4 achieved the highest F1-scores among the evaluated models, while still leaving substantial room for improvement. Its F1 across domains (X posts: 62.6, YouTube comments: 85.3, LLM generations: 55.9) shows that detection in diverse contexts remains challenging (a minimal prompting sketch follows this list).
- Toxicity Detection: The Perspective API proved of limited use, since it primarily identifies general toxicity rather than the subtler phenomenon of misgendering.
- Rule-based Methods: Approaches built on coreference resolution showed mixed success, often misidentifying coreference clusters (a simplified rule-based check is also sketched after this list).
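To ground the benchmark descriptions, here is a minimal sketch of few-shot chain-of-thought detection using the official `openai` Python client (v1+). The exemplar, prompt wording, and output parsing are assumptions for illustration; the paper's actual prompts are not reproduced here.

```python
from openai import OpenAI  # official client; reads OPENAI_API_KEY from the environment

client = OpenAI()

# One illustrative few-shot exemplar with a chain-of-thought rationale.
FEW_SHOT = """\
Text: "He gave a great speech." (Referent: Alex, pronouns: they/them)
Reasoning: The text refers to Alex with "he", but Alex's disclosed
pronouns are they/them, so Alex is misgendered.
Answer: Yes
"""

def detect_misgendering(text: str, name: str, pronouns: list[str]) -> str:
    """Return the model's reasoning followed by a Yes/No answer."""
    prompt = (
        f"{FEW_SHOT}\n"
        f'Text: "{text}" (Referent: {name}, pronouns: {"/".join(pronouns)})\n'
        "Reasoning:"
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content
```

The rule-based baseline depends on a coreference resolver to decide which pronouns refer to the individual. The stand-in below omits coreference entirely and naively treats every third-person pronoun as coreferent with the person, which both shows the rule's logic and illustrates why misassigned clusters cause errors (a correctly used "he" about a different person would still be flagged).

```python
import re

# Singular third-person pronoun paradigms; simplified and non-exhaustive.
PRONOUN_SETS = {
    "he":   {"he", "him", "his", "himself"},
    "she":  {"she", "her", "hers", "herself"},
    "they": {"they", "them", "their", "theirs", "themself", "themselves"},
}

def rule_based_flag(text: str, accepted: set[str]) -> bool:
    """Flag the text if it contains a pronoun from a paradigm the person
    does not use. Naively assumes every pronoun refers to the person;
    real systems need coreference resolution precisely at this step."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    disallowed = set().union(
        *(forms for key, forms in PRONOUN_SETS.items() if key not in accepted)
    )
    return bool(tokens & disallowed)

# Example: a person who uses they/them pronouns.
print(rule_based_flag("He spoke well.", accepted={"they"}))    # True: flagged
print(rule_based_flag("They spoke well.", accepted={"they"}))  # False
```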
For editing misgendered text, GPT-4's performance was promising: it corrected 97% of misgendering instances with minimal unnecessary edits, although its edits were largely confined to single sentences and did not account for broader context.
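A corresponding sketch of the correction sub-task, again via the chat completions API; the instruction wording and the helper's signature are assumptions, not the paper's prompt.

```python
from typing import Optional

from openai import OpenAI

client = OpenAI()

def correct_misgendering(text: str, name: str, pronouns: list[str],
                         deadname: Optional[str] = None) -> str:
    """Ask the model to rewrite `text` so the referent is gendered correctly."""
    profile = f"{name} uses {'/'.join(pronouns)} pronouns."
    if deadname:
        profile += f" Never use the name {deadname}; refer to them as {name}."
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (
                f"{profile}\n"
                f"Rewrite the following text so that {name} is gendered "
                "correctly, changing nothing else:\n"
                f"{text}"
            ),
        }],
        temperature=0,
    )
    return resp.choices[0].message.content
```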
Implications and Future Work
The contributions of this paper are twofold: defining a critical but under-researched task, and introducing a robust dataset to support it. The paper has significant implications for the development of more inclusive and accurate NLP systems. By incorporating community feedback into the design process, the research emphasizes the importance of creating systems that align with the diverse needs and concerns of gender-diverse individuals.
Future work could expand the scope of the dataset to include other domains such as news articles and academic content, address the limitations of current NLP systems in understanding nuanced language, and explore more sophisticated correction mechanisms that consider broader contexts. Furthermore, close collaboration with gender-diverse communities will be essential in developing tools that users can trust and feel safe using.
Conclusion
This research addresses a critical gap by providing a community-informed framework and dataset for tackling misgendering in text. The MisgenderMender dataset and initial benchmarks lay the groundwork for future advances in more respectful and accurate language technologies, and the public release of the dataset and tools aims to encourage continued research and development in this vital area.