
MLCommons Taxonomy of Hazards

Updated 5 November 2025
  • The MLCommons Taxonomy of Hazards is a hierarchical schema that organizes safety risks from LLMs into mutually exclusive, clearly defined categories.
  • It underpins the MLCommons AI Safety Benchmark by enabling standardized evaluations and cross-system comparisons for chat-based applications.
  • The taxonomy prioritizes hazards based on international legality and societal impact, with provisions for future expansion to cover additional risks.

The MLCommons Taxonomy of Hazards is a hierarchical categorization schema developed by the MLCommons AI Safety Working Group for systematically identifying and evaluating the safety risks associated with large language models (LLMs), particularly those configured for chat-based applications. The taxonomy underpins the MLCommons AI Safety Benchmark, currently at v0.5, and is designed to support standardized evaluation, cross-system comparison, and clear communication about model safety risks. Its design emphasizes mutually exclusive, clearly defined hazard categories prioritized by international legality and the magnitude of personal or societal risk.

1. Conceptual Foundation and Design Principles

The taxonomy is structured to provide an exhaustive grouping of hazards that are directly relevant to the outputs of LLMs in chat-assistant use cases. Categories are chosen to overlap as little as feasible and are intended to support standardized, reproducible safety assessments. Each category is defined at the top level and, in many cases, broken down into explicit subcategories and occasionally sub-subcategories. Two selection criteria govern inclusion: a hazard is included if the harm is internationally illegal or if it poses a heightened risk to personal or societal well-being. These criteria inform both the scope of the taxonomy and the order in which categories are prioritized for benchmark coverage.
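The two inclusion criteria combine as a logical OR. A minimal sketch of that rule follows; the `CandidateHazard` class, its field names, and the `in_scope` helper are hypothetical names introduced for illustration and are not part of any MLCommons tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateHazard:
    """Hypothetical record describing a hazard under consideration."""
    name: str
    internationally_illegal: bool
    heightened_risk: bool  # heightened risk to personal or societal well-being


def in_scope(hazard: CandidateHazard) -> bool:
    """Return True if the hazard meets at least one of the two criteria."""
    return hazard.internationally_illegal or hazard.heightened_risk


# Placeholder candidates; the names carry no meaning beyond the example.
legal_but_risky = CandidateHazard("candidate_a", False, True)
neither = CandidateHazard("candidate_b", False, False)

print(in_scope(legal_but_risky))  # True
print(in_scope(neither))          # False
```

In practice, deciding whether each flag holds is a matter of expert judgment; the sketch only shows how the two judgments combine once they have been made.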

The taxonomy is subject to ongoing revision: v0.5 of the benchmark covers seven hazard categories (with concrete tests and evaluation prompts) out of thirteen identified, with the remaining six reserved for future extensions of the benchmark (Vidgen et al., 2024).

2. Structure of the Taxonomy: Categories and Subcategories

The taxonomy comprises thirteen hazard categories, seven of which are tested in the current benchmark. Each category is defined with precise scoping, and many include enumerated subcategories:

| # | Hazard Category | Subcategories (illustrative) | Short Definition |
|---|---|---|---|
| 1 | Violent Crimes | Mass violence, Murder, Physical assault, Domestic abuse, Terror | Enables, encourages, or endorses violent crimes; excludes property damage, emotional abuse, and self-harm |
| 2 | Non-Violent Crimes | Theft, Human trafficking, Non-sexual child abuse, Property damage, Financial crime, Illicit items | Enables or encourages non-violent crimes; excludes minor municipal or local infractions |
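As a rough illustration of the hierarchy and of the mutual-exclusivity goal, the two rows above can be encoded as a small data structure with a reverse index that rejects any subcategory assigned to more than one category. This is a sketch under stated assumptions: `HazardCategory`, `build_index`, and `category_of` are hypothetical names, and only the two categories listed above are included.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class HazardCategory:
    """Hypothetical container for one top-level hazard category."""
    number: int
    name: str
    subcategories: Tuple[str, ...]


# Only the two categories shown in the table above.
EXCERPT = (
    HazardCategory(1, "Violent Crimes", (
        "Mass violence", "Murder", "Physical assault",
        "Domestic abuse", "Terror")),
    HazardCategory(2, "Non-Violent Crimes", (
        "Theft", "Human trafficking", "Non-sexual child abuse",
        "Property damage", "Financial crime", "Illicit items")),
)


def build_index(categories: Iterable[HazardCategory]) -> Dict[str, HazardCategory]:
    """Map each subcategory to its parent; fail if one appears twice."""
    index: Dict[str, HazardCategory] = {}
    for category in categories:
        for sub in category.subcategories:
            key = sub.lower()
            if key in index:
                raise ValueError(f"{sub!r} is assigned to more than one category")
            index[key] = category
    return index


INDEX = build_index(EXCERPT)


def category_of(subcategory: str) -> Optional[HazardCategory]:
    """Look up the parent category, ignoring case; None if unknown."""
    return INDEX.get(subcategory.lower())


print(category_of("Property damage").name)  # Non-Violent Crimes
```

The lookup reflects the scoping notes in the table: property damage is excluded from Violent Crimes and listed under Non-Violent Crimes, so it resolves to exactly one parent.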
