ACM Pre-Publication Verification Badges
- The paper establishes a systematic framework for empirically auditing scientific artifacts before publication.
- ACM badges define explicit criteria for artifact availability, functional evaluation, and result reproduction using quantitative metrics.
- The verification workflow includes environment setup, dependency management, and rigorous side-by-side data comparisons.
The ACM Pre-Publication Verification Badges system is an institutionalized framework for empirically auditing scientific software artifacts prior to publication. Originating from the ACM Artifact Review and Badging policy, these badges encode standardized criteria for artifact availability, usability, and empirical reproducibility, playing a central role in contemporary software-engineering research integrity. The framework defines precise requirements for archival, functional evaluation, and quantitative result reproduction, and supports both pre-publication and, with recent extensions, post-publication verification workflows (Pellegrini, 2021, Lopez-Moreno et al., 12 Jan 2026).
1. Formal Definition and Criteria of ACM Pre-Publication Badges
The ACM pre-publication badge system comprises five distinct badges, with three focusing on artifact availability and usability, and two targeting independent validation of experimental results (Lopez-Moreno et al., 12 Jan 2026). Key pre-publication badges and their formal criteria, as exemplified in the QN-antipattern reproducibility report, include:
- Artifacts Available (A1): Requires permanent archival of comprehensive research artifacts (source code, data, scripts, documentation) in a stable repository (e.g., Zenodo) with machine-readable licensing, usage instructions in a README, and explicit metadata detailing dependencies.
- Artifacts Evaluated – Functional (A2): Mandates installation and correct execution of the artifact by an independent reviewer in a controlled environment, with all dependencies satisfiable. At least one end-to-end “smoke test” or experiment must be reproducible, yielding meaningful output.
- Results Reproduced (A3): Demands regeneration of all paper figures/tables using the provided artifacts. Quantitative outputs (means, variances) must align with published values within formal statistical tolerances, typically $|\mu_{\text{rep}} - \mu_{\text{orig}}| / |\mu_{\text{orig}}| \le \varepsilon$ and $|\sigma^2_{\text{rep}} - \sigma^2_{\text{orig}}| / |\sigma^2_{\text{orig}}| \le \varepsilon$ for $\varepsilon = 0.05$ (5%). Any discrepancies must fall within pre-established thresholds or confidence intervals and be fully explained (a minimal tolerance check is sketched below).
The additional badges, Artifacts Evaluated – Reusable (for exceptionally modular and adaptable artifacts) and Results Replicated (for an independent reimplementation confirming the conclusions), round out the system but are less commonly awarded in a strictly pre-publication context.
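In practice, the Results Reproduced criterion reduces to a relative-tolerance comparison between reproduced and published statistics. The following is a minimal sketch of such a check; the 5% threshold comes from the criteria above, while the function name and the numeric values are placeholders, not figures from any paper.

```python
# Minimal sketch of the Results Reproduced (A3) tolerance check.
# The 5% threshold follows the criteria above; the metric names and
# numeric values are placeholders, not results from the paper.

def within_tolerance(reproduced: float, original: float, eps: float = 0.05) -> bool:
    """True if the relative deviation from the published value is at most eps."""
    if original == 0.0:
        return abs(reproduced) <= eps  # fall back to an absolute check near zero
    return abs(reproduced - original) / abs(original) <= eps

# Published vs. reproduced statistics for one figure (illustrative values only).
published = {"mean": 12.40, "variance": 1.31}
reproduced = {"mean": 12.45, "variance": 1.29}

for metric, orig in published.items():
    ok = within_tolerance(reproduced[metric], orig)
    print(f"{metric}: reproduced={reproduced[metric]:.2f}, "
          f"original={orig:.2f} -> {'PASS' if ok else 'FAIL'}")
```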
2. Application of Criteria: Case Study and Quantitative Verification
In practice, awarding ACM badges depends on rigorous technical checks performed by independent reviewers. The reproducibility auditing of the QN-antipatterns artifact (Pellegrini, 2021) exemplifies application of the formal criteria:
- The artifact, including Python scripts, Jupyter notebooks, metadata, and a stepwise README, was persistently archived on Zenodo with a DOI and explicit dependencies (numpy, pandas, matplotlib, JMT).
- Installation and execution were performed in an Arch Linux environment (kernel 5.10.4, Python 3.9.1, JMT 1.0.4); all precomputed results were purged and the experiments were re-run in full (six hours of JMT-based simulation). No execution errors were observed, and end-to-end outputs were produced.
- For results reproduction, numerical criteria were strictly enforced:
- Each figure's reproduced mean ($\mu_{\text{rep}}$) and variance ($\sigma^2_{\text{rep}}$) satisfied $|\mu_{\text{rep}} - \mu_{\text{orig}}| / |\mu_{\text{orig}}| \le \varepsilon$ and $|\sigma^2_{\text{rep}} - \sigma^2_{\text{orig}}| / |\sigma^2_{\text{orig}}| \le \varepsilon$ ($\varepsilon = 0.05$).
- Table-specific metrics (e.g., the utilization error and the mean absolute percentage error of the response time) were verified against the original 99% confidence intervals (±0.41% for utilization, ±0.10 ms for response time), with all errors well below the 5% threshold.
This evidence supported awarding all three major pre-publication badges, with transparent records (installation logs, side-by-side graphical and tabular comparisons, confidence interval checks).
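The table-level checks amount to verifying that each reproduced metric lies within the originally published 99% confidence interval. A hedged sketch follows; the half-widths (±0.41% and ±0.10 ms) are quoted from the report above, while the interval centers and reproduced values are placeholders.

```python
# Sketch of the confidence-interval check applied to the table metrics.
# The half-widths (0.41 % utilization, 0.10 ms response time) are quoted
# above; the interval centers and reproduced values are placeholders.

def inside_interval(value: float, center: float, half_width: float) -> bool:
    """True if value lies within [center - half_width, center + half_width]."""
    return abs(value - center) <= half_width

# (metric, original center, 99% CI half-width, reproduced value) -- placeholders
checks = [
    ("utilization [%]", 74.20, 0.41, 74.05),
    ("response time [ms]", 3.10, 0.10, 3.06),
]

for name, center, half_width, repro in checks:
    status = "PASS" if inside_interval(repro, center, half_width) else "FAIL"
    print(f"{name}: reproduced={repro} "
          f"CI=[{center - half_width:.2f}, {center + half_width:.2f}] -> {status}")
```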
3. Verification Workflow: Processes and Protocols
The badge verification workflow consists of five discrete steps, emphasizing technical repeatability and systematic documentation:
- Environment Setup: Provision an isolated Linux virtual machine with precise OS/kernel versioning, and install required language runtimes (Python, Jupyter) and simulation tools (JMT CLI confirmed in $PATH).
- Dependency Management: Clone artifact repository, initialize virtual environment, pip-install runtime dependencies (numpy, pandas, matplotlib).
- Artifact Upload Validation: Confirm repository DOI, verify archive checksum integrity, and cross-check README instructions against artifact contents.
- Experiment Execution: Purge precomputed results, initiate the main simulation scripts (e.g., run_simulations.py), monitor for runtime errors, and ensure full pipeline execution.
- Data Regeneration and Comparison: Run post-processing notebooks to recreate figures/tables, export them as PDFs, and conduct granular side-by-side validation against the originals embedded in the reproducibility report.
Each stage generates concrete evidence captured in logs, exported artifacts, and comparison records, fulfilling audit requirements for badge assignment.
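As an illustration, these steps can be wrapped in a thin automation layer that captures each step's output as audit evidence. This is a minimal sketch assuming a POSIX environment and a locally cloned artifact; the directory, virtual-environment, and notebook paths (qn-antipatterns-artifact/, .venv/, notebooks/analysis.ipynb) are hypothetical, while run_simulations.py and the pip dependencies are taken from the workflow above.

```python
# Minimal sketch of an automated verification run that records evidence logs.
# Assumes a POSIX shell and an already-cloned artifact; the directory and
# notebook names are hypothetical; run_simulations.py is from the workflow above.
import shutil
import subprocess
from pathlib import Path

ARTIFACT_DIR = Path("qn-antipatterns-artifact")   # hypothetical checkout location
LOG_DIR = Path("verification-logs")
LOG_DIR.mkdir(exist_ok=True)

def run_step(name: str, cmd: list[str]) -> None:
    """Run one workflow step and persist its combined stdout/stderr as evidence."""
    with open(LOG_DIR / f"{name}.log", "w") as log:
        subprocess.run(cmd, cwd=ARTIFACT_DIR, stdout=log,
                       stderr=subprocess.STDOUT, check=True)

# Steps 1-2: isolated environment and dependency management.
run_step("create-venv", ["python", "-m", "venv", ".venv"])
run_step("install-deps", [".venv/bin/pip", "install",
                          "numpy", "pandas", "matplotlib", "jupyter"])

# Step 4: purge precomputed results, then re-run the full simulation pipeline.
shutil.rmtree(ARTIFACT_DIR / "results", ignore_errors=True)
run_step("simulations", [".venv/bin/python", "run_simulations.py"])

# Step 5: regenerate figures/tables for side-by-side comparison with the originals.
run_step("regenerate", [".venv/bin/jupyter", "nbconvert", "--execute",
                        "--to", "pdf", "notebooks/analysis.ipynb"])
```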
4. Outcomes, Evidence, and Metadata Representation
Concrete evidence supporting badge award includes:
| Badge | Concrete Evidence | Repository/Link |
|---|---|---|
| Artifacts Available | Zenodo DOI, downloadable ZIP (~50 MB), comprehensive README | https://zenodo.org/record/4495665 |
| Artifacts Evaluated – Functional | Installation logs, smoke-test outputs, timings, error-free scripts | Provided in publication |
| Results Reproduced | Side-by-side figures/tables, error metric comparisons, console logs, confidence intervals | Publication supplement |
For formal integration into scholarly metadata, journals employ structured schemas (e.g., JSON-LD under Schema.org, JATS XML custom-meta blocks), recording the verification badge type, verifier identity (ORCID), verification date, and direct repository URLs. This persistent embedding ensures open auditability and downstream indexer discoverability (Lopez-Moreno et al., 12 Jan 2026).
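As a concrete illustration of such embedding, the sketch below serializes a badge record to a JSON-LD script block of the kind a journal landing page might carry. The property names mirror the post-publication example in Section 5 and are assumptions rather than a fixed Schema.org vocabulary; the ORCID and date are the placeholders used throughout this article.

```python
# Sketch of emitting a verification-badge record as embeddable JSON-LD metadata.
# Property names mirror the example in Section 5 and are assumptions, not a
# standardized Schema.org vocabulary; the ORCID and date are placeholders.
import json

badge_record = {
    "@context": "https://schema.org",
    "@type": "Badge",
    "badgeType": "ResultsReproduced",
    "verifier": "ORCID:0000-0002-XXXX-YYYY",
    "verificationDate": "2026-07-15",
    "repository": "https://zenodo.org/record/4495665",
}

# Embed in the article landing page as a <script type="application/ld+json"> block.
snippet = ('<script type="application/ld+json">\n'
           + json.dumps(badge_record, indent=2)
           + "\n</script>")
print(snippet)
```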
5. Post-Publication Badge Extension: Framework and Workflow
To extend verification credibility beyond the initial review, Lopez-Moreno et al. (12 Jan 2026) propose formalizing post-publication badges, specifically limited to “Results Reproduced (Post-Pub)” and “Results Replicated (Post-Pub)”.
- Any independent group can submit a verification report, comprising protocol documentation, hardware/software specs, scripts, and quantitative comparisons. An “attestation of independence” is mandatory.
- Journal staff (preferably original reviewers) vet the report for completeness, evidence, and verifier independence. A paper can earn at most one badge of each type, corresponding to the first accepted post-publication verification.
- Metadata is updated to reflect these badges, with persistent links to the verification repository, and badge provenance encoded in CrossRef deposits:
```json
{
  "@type": "Badge",
  "badgeType": "ResultsReproduced",
  "verifier": "ORCID:0000-0002-XXXX-YYYY",
  "verificationDate": "2026-07-15",
  "repository": "https://github.com/verifier/repro-code"
}
```
This formalizes the post-publication empirical audit, anchoring reproducibility within journal infrastructure.
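The vetting step can likewise be expressed as a completeness check over the components a post-publication report must contain. The sketch below derives its required fields from the list in this section; the dictionary keys and the example submission are illustrative assumptions.

```python
# Sketch of a completeness check for a submitted post-publication verification
# report. Required components follow this section; the keys and the example
# submission are illustrative assumptions.

REQUIRED_COMPONENTS = {
    "protocol_documentation",
    "hardware_software_specs",
    "verification_scripts",
    "quantitative_comparisons",
    "attestation_of_independence",
}

def missing_components(report: dict) -> list[str]:
    """Return the required components that are absent or empty in the report."""
    present = {key for key, value in report.items() if value}
    return sorted(REQUIRED_COMPONENTS - present)

submitted = {
    "protocol_documentation": "protocol.md",
    "hardware_software_specs": "environment.yaml",
    "verification_scripts": "https://github.com/verifier/repro-code",
    "quantitative_comparisons": "comparison_tables.pdf",
    "attestation_of_independence": None,   # missing, so the report is not yet acceptable
}

gaps = missing_components(submitted)
print("report complete" if not gaps else f"missing: {', '.join(gaps)}")
```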
6. Community Impact, Benefits, Limitations, and Controversies
Benefits:
ACM badges incentivize meticulous artifact documentation and open, persistent archival. They enable independent, standardized empirical validation, promote trust via empirical audit, and facilitate long-term reuse by future researchers. Post-publication badges shift verification responsibilities beyond editorial staff, creating academic credit for verifiers and reducing duplication of effort on architectures already verified. They provide industry with reliable signals, aid in fraud deterrence, and support rigorous cumulative research in cyber-physical systems and related domains (Lopez-Moreno et al., 12 Jan 2026, Pellegrini, 2021).
Limitations:
Administrative burden for journals (badge tracking, vetting, metadata updates); partial reproducibility when data or dependencies are proprietary or hardware-specific; security risks from executing third-party code; possible misinformation propagation; and systemic inequalities (high-compute verifications dominated by large labs). Only the first successful post-publication report per badge type is recognized, potentially discouraging parallel efforts.
Alternative Perspectives:
Critics argue that pre-publication review and code availability suffice, or that verification should be mandatory only before publication. Others propose decentralized verification platforms (e.g., Papers with Code) over journal-controlled badges, or advocate prioritizing replications over simple reproductions. Concerns also exist about deepening inequality, with mitigation attempted via badge quantity limits and encouragement of verifications for less prominent works.
A plausible implication is that the ACM badge system, especially with its structured post-publication extension, is converging toward a global standard for empirical transparency and reliability in software-engineering research.