OpenRubrics Dataset Overview
- OpenRubrics is a structured dataset with 53,398 prompt–rubric pairs featuring both hard rules and principles for multi-dimensional evaluations.
- Its Contrastive Rubric Generation framework extracts explicit and implicit criteria from response pairs, achieving 98.2% preference-label consistency.
- Benchmark results demonstrate rubric-based models can improve performance by up to 9.5% in specialized tasks, strengthening LLM alignment.
OpenRubrics is a large-scale, diverse dataset for scalable synthetic rubric generation, developed to address the limitations of scalar or pairwise judgments in reward modeling for reinforcement learning from human feedback (RLHF). By providing a corpus of structured, multi-dimensional rubrics aligned to various prompt domains, OpenRubrics enables the training and evaluation of rubric-based reward models, with demonstrable gains in alignment and downstream policy optimization for LLMs (Liu et al., 9 Oct 2025).
1. Dataset Composition and Statistics
OpenRubrics comprises 53,398 unique prompt–rubric pairs. The dataset’s domain distribution is as follows:
| Domain | # Prompt–Rubric Pairs | Percentage |
|---|---|---|
| Instruction-following | 23,184 | 43.4% |
| Biomedical | 12,817 | 24.0% |
| Open-domain QA | 10,560 | 19.8% |
| Coding/Math | 6,837 | 12.8% |
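The percentages in the table follow directly from the counts. As a quick arithmetic check, the short sketch below recomputes each share from the reported pair counts; the variable names are illustrative only.

```python
# Pair counts per domain, copied from the table above.
counts = {
    "Instruction-following": 23_184,
    "Biomedical": 12_817,
    "Open-domain QA": 10_560,
    "Coding/Math": 6_837,
}

total = sum(counts.values())  # should match the reported 53,398 pairs

# Share of each domain, rounded to one decimal place as in the table.
shares = {domain: round(100 * n / total, 1) for domain, n in counts.items()}

for domain, pct in shares.items():
    print(f"{domain}: {pct}%")
```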
Rubrics are decomposed into two types:
- Hard rules: Explicit, often verifiable constraints (e.g., “The response must use only information present in the passage”).
- Principles: Implicit, qualitative criteria (e.g., “The response should demonstrate clarity and logical structure”).
Each rubric specifies between 3 and 8 dimensions (mean = 5.2 per rubric), with domains such as biomedical and coding exhibiting more multi-faceted rubrics than general instruction-following tasks.
2. Data Format and Access
All OpenRubrics data is released in JSON format. Each sample consists of a prompt field, a rubric list, and associated metadata. The file structure is standardized as follows:
- JSON fields: `prompt`, `domain`, `rubric` (list of objects with `type` and `description`), and `metadata` (containing `rubric_dimensions`, `source`, and `preference_label_consistency`).
- Repository & access: The dataset and related code are available under the CC BY 4.0 License at https://github.com/OpenRubrics/OpenRubrics and https://huggingface.co/datasets/OpenRubrics/OpenRubrics.
- Licensing: CC BY 4.0 (academic and commercial use; attribution required).
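To make the field list concrete, here is a minimal sketch of what one record might look like, together with a small shape check. The field names come from the list above; every value, the `hard_rule`/`principle` label strings, and the `validate_record` helper are assumptions made for illustration and are not taken from the released files.

```python
import json

REQUIRED_FIELDS = ("prompt", "domain", "rubric", "metadata")
# Assumed label strings; the source names the two rubric types but not their encoding.
RUBRIC_TYPES = {"hard_rule", "principle"}


def validate_record(record: dict) -> list[str]:
    """Return a list of shape problems; an empty list means the record matches."""
    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS if key not in record]
    for i, item in enumerate(record.get("rubric", [])):
        if not {"type", "description"} <= set(item):
            problems.append(f"rubric[{i}] lacks type or description")
        elif item["type"] not in RUBRIC_TYPES:
            problems.append(f"rubric[{i}] has unknown type {item['type']!r}")
    return problems


# Hypothetical record: every value below is invented for illustration.
example = json.loads("""
{
  "prompt": "Summarize the passage in two sentences.",
  "domain": "Instruction-following",
  "rubric": [
    {"type": "hard_rule",
     "description": "The response must use only information present in the passage."},
    {"type": "hard_rule",
     "description": "The response must contain exactly two sentences."},
    {"type": "principle",
     "description": "The response should demonstrate clarity and logical structure."}
  ],
  "metadata": {"rubric_dimensions": 3, "source": "hypothetical",
               "preference_label_consistency": 1.0}
}
""")

print(validate_record(example))  # prints [] for a well-formed record
```

A check like this is mainly useful when merging records from several sources, since it reports every missing field rather than stopping at the first one.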
3. Generation Methodology
OpenRubrics employs the Contrastive Rubric Generation (CRG) framework. CRG decomposes rubric synthesis into the following steps:
- Response Contrasting: For each prompt, a preferred and a rejected response are selected (using preexisting preference datasets or synthetic LLM outputs).
- Component Extraction: CRG prompts an LLM to analyze the contrast between the preferred and rejected responses and to explicitly enumerate both:
  - Hard rules (directly violated in the rejected response but satisfied in the preferred one)
  - Principles (qualities more subtly present or absent, e.g., relevance, informativeness)
- Rubric Assembly: The set of hard rules and principles is combined to form a comprehensive multi-dimensional rubric for the given prompt.
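The three steps above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the authors' implementation: the instruction template, the JSON reply format with `hard_rules` and `principles` keys, the `contrastive_rubric` function, and the `fake_llm` stand-in are all hypothetical. Any callable that maps a request string to a reply string can take the place of `fake_llm`.

```python
import json
from typing import Callable

# Hypothetical instruction template; the wording used by CRG is not given in the text.
CRG_TEMPLATE = (
    "Prompt:\n{prompt}\n\n"
    "Preferred response:\n{preferred}\n\n"
    "Rejected response:\n{rejected}\n\n"
    "List the criteria that separate the two responses as JSON with the keys "
    '"hard_rules" and "principles", each a list of strings.'
)


def contrastive_rubric(
    prompt: str, preferred: str, rejected: str, llm: Callable[[str], str]
) -> list[dict]:
    """Contrast one response pair and assemble a typed rubric from the LLM reply."""
    # Response contrasting: both responses are placed side by side in one request.
    request = CRG_TEMPLATE.format(prompt=prompt, preferred=preferred, rejected=rejected)
    # Component extraction: the reply is assumed to be a JSON object.
    extracted = json.loads(llm(request))
    # Rubric assembly: hard rules first, then principles.
    rubric = [{"type": "hard_rule", "description": d} for d in extracted.get("hard_rules", [])]
    rubric += [{"type": "principle", "description": d} for d in extracted.get("principles", [])]
    return rubric


def fake_llm(request: str) -> str:
    """Stand-in for a real model call; returns a fixed, invented reply."""
    return json.dumps({
        "hard_rules": ["The response must use only information present in the passage."],
        "principles": ["The response should demonstrate clarity and logical structure."],
    })


rubric = contrastive_rubric("Summarize the passage.", "good answer", "bad answer", fake_llm)
for item in rubric:
    print(item["type"], "-", item["description"])
```

Passing the model as a callable keeps the sketch testable without network access and leaves the choice of LLM client open.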
The CRG loss function is defined over the prompt, the preferred and rejected responses, the rubric satisfaction of each response on each rubric dimension, and the number of rubric dimensions per pair.
4. Quality Control and Reliability
To maximize the reliability of rubrics and prevent alignment drift or ambiguity, OpenRubrics employs a preference-label consistency framework:
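- Preference-Label Consistency: For each generated rubric, agreement is measured between the relative scorings of preferred and rejected responses. This can be formalized as:

  $$\mathrm{PLC} = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left[ s(x_i, y_i^{+}) > s(x_i, y_i^{-}) \right]$$

  where $s$ is the rubric-based score function, $x_i$ is the $i$-th prompt with preferred response $y_i^{+}$ and rejected response $y_i^{-}$, and $N$ is the number of evaluated prompt–response pairs.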
- Rejection Sampling: Rubrics whose preference-label consistency falls below a threshold (set at 95%) are filtered out, so that the induced scoring function reliably prefers the designated preferred responses.
This results in a measured label consistency of 98.2% across the dataset, with detailed rejection logs released as part of the metadata.
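The consistency measure and the rejection step can be sketched in a few lines. This is an illustration under assumptions, not the released filtering code: the function names, the strict `>` comparison, and the toy keyword-counting scorer are hypothetical, while the 0.95 threshold is the value stated above.

```python
from typing import Callable, Sequence

Pair = tuple[str, str, str]  # (prompt, preferred response, rejected response)
ScoreFn = Callable[[str, str], float]  # rubric-based score of a response to a prompt


def preference_label_consistency(score: ScoreFn, pairs: Sequence[Pair]) -> float:
    """Fraction of pairs in which the preferred response outscores the rejected one."""
    if not pairs:
        raise ValueError("at least one pair is required")
    hits = sum(score(x, y_pos) > score(x, y_neg) for x, y_pos, y_neg in pairs)
    return hits / len(pairs)


def keep_rubric(score: ScoreFn, pairs: Sequence[Pair], threshold: float = 0.95) -> bool:
    """Rejection step: keep a rubric only if its consistency reaches the threshold."""
    return preference_label_consistency(score, pairs) >= threshold


def toy_score(prompt: str, response: str) -> float:
    """Toy stand-in for a rubric-based scorer: counts mentions of 'passage'."""
    return float(response.lower().count("passage"))


pairs = [
    ("q1", "According to the passage, A.", "I think A."),
    ("q2", "The passage states B.", "Probably B."),
    ("q3", "C.", "The passage is unclear."),  # the scorer gets this pair wrong
]

plc = preference_label_consistency(toy_score, pairs)
print(f"consistency = {plc:.3f}, keep = {keep_rubric(toy_score, pairs)}")
```

In this toy run the scorer agrees with the labels on two of three pairs, so the rubric would be rejected at the 95% threshold.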
5. Benchmarking and Applications
OpenRubrics is used to train Rubric-RM, a rubric-based reward model designed for reward modeling and LLM alignment. The following are key results:
- Reward-Modeling Benchmarks: Rubric-RM trained on OpenRubrics demonstrates a mean improvement of 6.8% over strong size-matched baselines on standard preference and reward modeling benchmarks.
- Alignment Improvement: Rubric-based signals enable model alignment with nuanced human-like standards, outperforming scalar judgment regimes.
- Transfer to Policy Models: Rubric-RM is incorporated for policy fine-tuning by using rubric-derived rewards in reinforcement learning protocols.
Performance transfer results include:
- Instruction-Following Evaluation: On instruction-following benchmarks, Rubric-RM-aligned models achieve a 4.2% absolute gain in human agreement metrics over scalar reward modeling baselines.
- Biomedical QA: In biomedical answer generation, rubric-trained models register a 9.5% incremental improvement in F1-based utility, evidencing rubric-derived rewards' efficacy in specialized domains.
6. Example Entries
Representative prompt–rubric pairs from OpenRubrics illustrate the diversity and granularity of alignment signals:
Example 1: (Instruction-Following)
Example 2: (Biomedical)
Example 3: (Coding/Math)
References
- "OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment" (Liu et al., 9 Oct 2025)