PRIMAD-LID: Enhancing Reproducibility

Updated 14 June 2026

The PRIMAD-LID Extension is a framework that enhances computational reproducibility by augmenting the original six PRIMAD dimensions with Lifespan, Interpretation, and Depth modifiers.
It systematically documents key factors including the execution platform, research objectives, implementation, methods, actors, and data with precise temporal and interpretative metadata.
The framework supports robust cross-disciplinary reproducibility audits and targeted diagnostic practices by standardizing metadata recording and validation workflows.

The PRIMAD-LID Extension defines an integrated and discipline-diagnostic framework for computational reproducibility by systematically augmenting the original PRIMAD model’s six core dimensions—Platform, Research objective, Implementation, Methods, Actors, and Data—with three cross-cutting modifiers: Lifespan, Interpretation, and Depth. This nine-facet structure formalizes all factors required to achieve, evaluate, and document reproducibility in computational research, enabling unambiguous specification, targeted diagnosis, and robust cross-disciplinary application (Aloqalaa et al., 5 Jan 2026).

1. PRIMAD: The Six Core Dimensions

The foundational PRIMAD framework addresses longstanding terminology ambiguity by identifying six variables whose control or variation must be stated in any reproducibility attempt:

P (Platform): The execution environment, covering hardware architecture, operating system, libraries, compilers, virtual machines, and containerization.
R (Research objective): The specific scientific question or goal; e.g., tumor image classification.
I (Implementation): Codebases, scripts, executables, or pipeline definitions operationalizing the method.
M (Methods): Abstract algorithms or methodological protocols, e.g., “random forest with cross-validation,” not tied to code instantiation.
A (Actors): The individuals or teams engaging with the experiment—developers, annotators, experimenters.
D (Data): All input datasets, configuration parameters, and any data preprocessing transformations.

The PRIMAD formalism operationalizes reproducibility: for a given study, one specifies which components are held constant and which are varied—e.g., “holding P and I fixed, but varying M to perform method-agnostic validation.” This structure clarifies the definitions of repeatability, replicability, portability, and robustness in computational science.
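
The fixed/varied specification described above can be written down as a small record. What follows is a minimal sketch, assuming a hypothetical `StudySpec` class (a name introduced here for illustration, not part of PRIMAD) that only checks that no dimension is declared both fixed and varied and reports dimensions left unstated:

```python
from dataclasses import dataclass

# The six PRIMAD dimensions, by the one-letter labels used in the text.
PRIMAD = frozenset("PRIMAD")


@dataclass(frozen=True)
class StudySpec:
    """Hypothetical record of which dimensions a study holds fixed or varies."""
    fixed: frozenset
    varied: frozenset

    def __post_init__(self):
        overlap = self.fixed & self.varied
        if overlap:
            raise ValueError(f"dimensions both fixed and varied: {sorted(overlap)}")
        unknown = (self.fixed | self.varied) - PRIMAD
        if unknown:
            raise ValueError(f"not PRIMAD dimensions: {sorted(unknown)}")

    def unstated(self):
        """Dimensions whose status was never declared."""
        return PRIMAD - self.fixed - self.varied


# Example from the text: P and I fixed, M varied; R, A, and D are not declared.
spec = StudySpec(fixed=frozenset("PI"), varied=frozenset("M"))
print(sorted(spec.unstated()))  # ['A', 'D', 'R']
```

Reporting the unstated dimensions, rather than silently treating them as fixed, is a design choice of this sketch: it surfaces exactly the ambiguity that PRIMAD asks authors to remove.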

2. The LID Modifiers: Lifespan, Interpretation, Depth

The PRIMAD-LID extension systematically augments each PRIMAD component with three modifiers:

2.1 Lifespan (L)

Lifespan qualifies each artifact temporally, recording creation date ($t_0$), modification history ($t_\text{mod}$), last access ($t_\text{access}$), and predicted end-of-life ($t_\text{obsolete}$):

$\mathrm{Lifespan}_L(X) = \left\{ t_0(X),\, t_\text{mod}(X),\, t_\text{access}(X),\, t_\text{obsolete}(X) \right\}, \quad X \in \{P,R,I,M,A,D\}$

This enables temporal auditability and supports long-term usability assessments; research artifacts become effectively “expired” unless Lifespan is actively managed.
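
As an illustration of how the four Lifespan timestamps might be recorded, here is a minimal sketch. The `Lifespan` class, its `is_expired` check, and all dates are assumptions made for this example; the source defines only the four fields.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Lifespan:
    """Hypothetical container for the four timestamps in Lifespan_L(X)."""
    t0: date                                          # creation date
    t_mod: List[date] = field(default_factory=list)   # modification history
    t_access: Optional[date] = None                   # last access, if known
    t_obsolete: Optional[date] = None                 # predicted end-of-life, if known

    def is_expired(self, today: date) -> bool:
        """True if a predicted end-of-life is recorded and has been reached."""
        return self.t_obsolete is not None and today >= self.t_obsolete


# Illustrative values only; these dates are made up for the example.
platform_lifespan = Lifespan(
    t0=date(2020, 10, 1),
    t_mod=[date(2021, 3, 15)],
    t_access=date(2023, 1, 10),
    t_obsolete=date(2023, 4, 30),
)
print(platform_lifespan.is_expired(date(2024, 1, 1)))  # True
```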

2.2 Interpretation ($I_i$)

Interpretation captures the reasoning, heuristic, or contextual logic that mediates between raw numerical outputs and scientific conclusions:

$I_i(D)$ may include statistical tests, visualization standards, or significance thresholds.
$I_i(M)$ documents rationale for algorithm selection or empirical parameter search strategies.

This metadata decouples the interpretive layer, making scientific insight itself subject to reproducibility scrutiny.

2.3 Depth ($D_t$)

Depth denotes the required granularity of artifact description, parameterized by a field-specific attribute vector:

$\mathrm{Depth}_{D_t}(X) = \left\{ \mathrm{version}(X),\, \mathrm{provenance}(X),\, \mathrm{resource\_requirements}(X),\, \mathrm{licence}(X),\, \mathrm{metadata\_schema}(X),\, \ldots \right\}, \quad X \in \{P,R,I,M,A,D\}$

The number and type of required attributes are context-dependent; for example, bioinformatics pipelines standardly require ten distinct metadata fields, while ML experiments may operate with a distinct checklist. Depth formalization increases comparability and completeness while allowing adaptation to community standards (e.g., FAIR principles).
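
A Depth requirement can be checked mechanically once a checklist is fixed. The sketch below assumes a hypothetical `REQUIRED_ATTRIBUTES` table and `missing_depth_attributes` helper; the attribute names are taken from the Depth definition above, but which attributes a given community actually requires is an assumption of this example.

```python
# Hypothetical per-field checklists. Only a generic "default" entry is given;
# a real community checklist would replace or extend it.
REQUIRED_ATTRIBUTES = {
    "default": {
        "version",
        "provenance",
        "resource_requirements",
        "licence",
        "metadata_schema",
    },
}


def missing_depth_attributes(description: dict, field_name: str = "default") -> set:
    """Return required attributes that are absent or empty in an artifact description."""
    required = REQUIRED_ATTRIBUTES[field_name]
    return {name for name in required if not description.get(name)}


# Illustrative artifact description; the values are made up.
dataset = {"version": "1.2", "licence": "CC-BY-4.0", "provenance": ""}
print(sorted(missing_depth_attributes(dataset)))
# ['metadata_schema', 'provenance', 'resource_requirements']
```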

3. Unified Conceptual Structure

The PRIMAD-LID framework is structured as a $3 \times 6$ matrix: rows represent the LID modifiers (Lifespan, Interpretation, Depth); columns represent the PRIMAD dimensions (P, R, I, M, A, D). Each cell specifies the metadata and procedural requirements for the corresponding artifact-factor pair. The table below shows the same matrix transposed, with one row per PRIMAD dimension.

PRIMAD\LID	Lifespan	Interpretation	Depth
Platform (P)	Timestamps, modification and version history	Reason for platform choice, scalability explanation	Container digests, base image versions, resource needs
Research (R)	Research start and update times	Hypothesis framing, statistical test rationale	Precise statements, documentation completeness
Implementation (I)	Code commit dates, build environments	Justification of coding choices, optimization explanation	Source version, dependencies, source/compiled mapping
Methods (M)	Protocol versioning, updates	Algorithm selection rationale, parameter tuning methods	Algorithmic parameters, workflow schemas
Actors (A)	Team membership, access/control records	Decision logs, annotation/contribution standards	Roles, background, contribution specifications
Data (D)	Acquisition, preprocessing history	Data cleaning choices, statistical thresholds	Checksums, schemas, provenance, licences

All 18 (6×3) cells supply the composite foundation for computational reproducibility; Figure 1 in (Aloqalaa et al., 5 Jan 2026) visually represents these dependencies.
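
One way to use the 18-cell structure as an audit aid is to enumerate the cells and report which ones a study has documented. This is a minimal sketch under that assumption; the `coverage` function and the dictionary layout are hypothetical, and the sample entries reuse the Platform row of the cross-platform IR example in Section 4.

```python
from itertools import product

DIMENSIONS = ["P", "R", "I", "M", "A", "D"]
MODIFIERS = ["Lifespan", "Interpretation", "Depth"]


def coverage(record: dict):
    """Return (fraction of documented cells, list of undocumented cells)."""
    cells = list(product(DIMENSIONS, MODIFIERS))  # 6 x 3 = 18 cells
    missing = [cell for cell in cells if not record.get(cell)]
    return (len(cells) - len(missing)) / len(cells), missing


# Only the Platform row is filled in here, so 3 of 18 cells are documented.
record = {
    ("P", "Lifespan"): "OS and Java runtime patch tracking",
    ("P", "Interpretation"): "File-system differences as I/O confounders",
    ("P", "Depth"): "Java version, heap size, path separator specifics",
}
fraction, missing = coverage(record)
print(f"{fraction:.2f}", len(missing))  # 0.17 15
```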

4. Application Scenarios

Example 1: High-throughput Sequencing (HTS) Pipeline

Platform: Nextflow 20.10.0 on Ubuntu 18.04 container.
- Lifespan: $t_\text{mod}$ 1
- Interpretation: Nextflow chosen for scalability
- Depth: Container digest, base image version, resource limits
Data: Raw FASTQ (v1.2), reference genome build 38.
- Lifespan: Data acquisition and update timeline
- Interpretation: Justification for sequence trimming thresholds
- Depth: Checksums, sample schema

Controlling these facets preserved 98% workflow “wholeness” across major platform updates and over long time horizons.

Example 2: Cross-platform IR Portability

Platform: Lucene 8.5 on Ubuntu 20.04 vs. Windows 10
- Lifespan: OS and Java runtime patch tracking
- Interpretation: File-system differences as I/O confounders
- Depth: Java version, heap-size, path separator specifics

Systematic documentation across all nine factors localizes variation sources, separating genuine platform effects from methodological inconsistencies.

Example 3: Method-Independent Validation

This scenario holds Research objective, Actors, and Data constant while swapping Methods (random forest $\rightarrow$ XGBoost) and documents Lifespan, Depth, and Interpretation for both. Consistent results under controlled method variance establish reproducibility at the Interpretation layer, supporting methodological robustness.

5. Formal PRIMAD-LID Reproducibility Predicate

PRIMAD-LID recasts reproducibility as a formal predicate whose value is an assessment of consistency, transparency, and coverage given the fixed and varied dimensions of a specific reproducibility study.

6. Guidelines and Best Practices

Authors and reviewers are advised to:

Version all artifacts: Employ commit hashes, container tags, dataset DOIs, or checksums.
Explicitly record Lifespan metadata for all components.
Publish Interpretation: Document decision rationales, hypothesis tests, and analysis conventions.
Parameterize Depth: Adopt domain-relevant schemas and specify resource profiles, provenance chains, licences.
Use open, persistent repositories (e.g., Zenodo, Figshare) for all artifacts.
Modularize components: Decouple data ingress, analysis, and reporting to increase maintainability.
Apply environment control (e.g., CI/CD pipelines) for proactive reproducibility assurance.
Vary dimensions empirically to test reproducibility coverage.
Provide an explicit mapping of which of the nine facets are held constant or varied in every replication attempt.
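
The first guideline (versioning artifacts by checksum) can be sketched with the Python standard library. The helper names `sha256_of` and `manifest` are hypothetical, and SHA-256 is an assumed choice of hash; the source does not prescribe one.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(paths) -> dict:
    """Map each artifact path (as a string) to its checksum."""
    return {str(p): sha256_of(Path(p)) for p in paths}


# Demonstration on a throwaway file; the file name and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.fastq"
    sample.write_bytes(b"@read1\nACGT\n+\nFFFF\n")
    for name, checksum in manifest([sample]).items():
        print(Path(name).name, checksum)
```

Storing such a manifest next to the artifacts lets a later replication attempt confirm that the Data and Implementation facets were in fact held constant.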

Consistent application of these best practices operationalizes PRIMAD-LID as both a planning mechanism and an audit checklist, supporting discipline-diagnostic reproducibility coverage and transparent evaluation for computational studies (Aloqalaa et al., 5 Jan 2026).


References

Aloqalaa et al. (2026). PRIMAD-LID: A Developed Framework for Computational Reproducibility.
