
The language of race, ethnicity, and ancestry in human genetic research (2106.10041v1)

Published 18 Jun 2021 in q-bio.PE

Abstract: The language commonly used in human genetics can inadvertently pose problems for multiple reasons. Terms like "ancestry", "ethnicity", and other ways of grouping people can have complex, often poorly understood, or multiple meanings within the various fields of genetics, between different domains of biological sciences and medicine, and between scientists and the general public. Furthermore, some categories in frequently used datasets carry scientifically misleading, outmoded or even racist perspectives derived from the history of science. Here, we discuss examples of problematic lexicon in genetics, and how commonly used statistical practices to control for the non-genetic environment may exacerbate difficulties in our terminology, and therefore understanding. Our intention is to stimulate a much-needed discussion about the language of genetics, to begin a process to clarify existing terminology, and in some cases adopt a new lexicon that both serves scientific insight, and cuts us loose from various aspects of a pernicious past.

Citations (22)

Summary

The Language of Race, Ethnicity, and Ancestry in Human Genetic Research

The paper by Birney et al. critically examines the lexicon of human genetic research, focusing on how terms such as "ancestry," "ethnicity," and "race" become problematic because they carry complex and multiple meanings. It underscores that language in human genetics can obstruct as well as facilitate communication among geneticists, researchers in other scientific fields, and the public. The authors aim to stimulate a discussion about updating and clarifying this terminology, with attention to both scientific precision and historical context.

The paper highlights the mismatch between the terminology of genetic research and how that terminology is interpreted in other domains, noting that some traditional categories carry scientifically misleading or racist perspectives. This is evident in terms inherited from older scientific literature or borrowed from other fields, such as anthropology and population genetics, where longstanding ideas of race have been discredited. The authors argue that the racial categorizations historically used in studies have little relevance today, given the current understanding that there are no natural genetic boundaries at any global scale.

A key focus of the paper is the methodological consequences of imprecise and outdated language. The authors discuss how common statistical practices may inadvertently exacerbate misunderstandings, particularly in genome-wide association studies (GWAS), which are typically conducted on datasets that carry outdated labels. One example is the term "Caucasian," a misnomer rooted in pseudo-scientific classification that persists in datasets and the scientific literature.

Furthermore, the authors explore how genetic research can inadvertently contribute to societal misconceptions about race and ethnicity. They emphasize the unintended consequences of the categories used in population genetic analyses, particularly when labels derived from genetic clustering are misread as definitive statements about ancestry.

The authors argue that the genetics community must address this issue directly, given the persistent misuse of genetic data to promote racist ideologies. They advocate increased interdisciplinary collaboration to refine the terminology used to describe human genetic diversity. This means avoiding problematic terms, providing context when using labels from existing datasets, and favoring clarity over brevity.

The paper's proposed solutions include adopting new terminology that describes genetic ancestry more accurately while disentangling it from culturally and politically loaded terms. It also emphasizes the need for researchers to document how non-genetic factors, such as cultural, social, and environmental contexts, influence phenotypic outcomes.

Looking ahead, the authors highlight the potential for ongoing advances in genomic sequencing to improve our understanding of genetic diversity. This understanding, paired with more precise language, could mitigate historical biases and improve how genetic research is communicated to lay audiences.

In summary, this work is a call to action for the genetics community to reconsider and refine its terminology, drawing on interdisciplinary collaboration, in order to foster more accurate scientific communication and public understanding. This approach aligns with the broader aim of the field: to understand human biology through a genetic lens unencumbered by outdated and inappropriate terminology.
