
Partial Representative Execution (PREX)

Updated 10 January 2026
  • PREX is a methodological paradigm that executes representative subsets of behaviors to capture essential correctness or coverage properties in incomplete or nondeterministic systems.
  • It integrates static program analysis, decision point injection, and runtime scheduling to prune redundant executions while ensuring accurate behavior reproduction.
  • PREX is applied across model-driven engineering, concurrency debugging, and GPU kernel fuzzing to accelerate testing and debugging with controlled approximation guarantees.

Partial Representative Execution (PREX) is a general methodological paradigm and supporting set of technical approaches for enabling runtime execution, analysis, or testing of systems that are incomplete, large, or highly nondeterministic, by executing only a representative subset of behaviors, prefixes, or slices, yet preserving essential correctness or coverage properties. Recent literature demonstrates that PREX unifies techniques across model-driven engineering (Bagherzadeh et al., 2021), concurrency debugging (González-Abril et al., 2021), and high-throughput program analysis (particularly in heterogeneous CPU/GPU environments) (Singh et al., 3 Jan 2026). Implementations exploit static program structure, operational semantics, or affine data access patterns to justify aggressive pruning or injection of decision points, while offering either correctness preservation or controlled approximation guarantees.

1. Conceptual Foundations of Partial Representative Execution

PREX formalizes and automates the process of executing incomplete, partial, or pruned subsets of system behaviors, with two high-level goals: (a) enabling analysis or debugging of incomplete artifacts without finalizing full semantics, and (b) reducing computational effort by restricting execution to those fragments ("representative executions") that are sufficient to capture relevant correctness or error-detection properties.

In model execution, PREX guarantees that any behavior possible in the original partial artifact remains expressible, while introducing no additional behavior except for engineered mechanisms to resolve incompleteness via runtime or scripted decision-making (Bagherzadeh et al., 2021). In concurrent system tracing or replay, PREX instruments executions to precisely follow an arbitrary prefix of a recorded trace, then transitions to free execution, thus decoupling the initial reproducibility window from the tail of execution (González-Abril et al., 2021). In the context of large-scale fuzzing of GPGPU kernels, PREX relies on affine-access analysis to replace O(B × T) thread executions by a handful of boundary "representative" threads, sufficient to surface all bugs detectable in the class under analysis (Singh et al., 3 Jan 2026).

2. PREX in Model-Driven Engineering: Execution of Partial State Machines

PREX was introduced for execution of partial UML-RT state machines to solve the limitation that ordinary simulators and code generators only handle fully realized models (Bagherzadeh et al., 2021). The key components are:

  • Static Analysis: The model is checked for "stuck" configurations (P₁–P₁₁), including missing initial states, broken transition chains, deadlock states, and unhandled events.
  • Automatic Refinement: Where the model is incomplete, PREX inserts decision points (synthetic choice-states) and stub transitions, so at runtime the execution never deadlocks. These may be guarded or refined using user input or scripts.
  • Input-Driven Execution: At each encountered decision point, the user (or a test harness) supplies the next move, either interactively or driven by scripted "execution rules."

The transformation preserves all behaviors of the original model (as shown formally by simulation preorder) and adds only the flexibility to "fill in" missing behavior dynamically, ensuring that the refined system simulates the original. Performance evaluations on UML-RT models show transformation and execution overheads are moderate. The PMExec engine provides a reference implementation.
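The decision-point mechanism described above can be illustrated with a minimal sketch. This is not the PMExec API; all names (`Rule`, `step`, `resolve`) and the toy state machine are hypothetical, and the example only shows the core idea: modeled transitions execute as-is, while incomplete states fall through to an injected decision point resolved by a scripted rule or by the user.

```python
# Hedged sketch of PREX decision-point resolution for a partial state
# machine. Names and structure are illustrative, not the PMExec API.
from dataclasses import dataclass

@dataclass
class Rule:          # a scripted "execution rule": in `state`, take `choice`
    state: str
    choice: str

def resolve(state, candidates, rules, ask_user):
    """At a decision point, a matching rule decides; otherwise the user does."""
    for r in rules:
        if r.state == state:
            return r.choice
    return ask_user(state, candidates)

# Partial machine: state 'B' has no modeled outgoing transition, so PREX
# would inject a decision point offering stub transitions to known states.
transitions = {"A": {"go": "B"}}   # modeled behavior
states = ["A", "B", "C"]

def step(state, event, rules, ask_user=None):
    outgoing = transitions.get(state, {})
    if event in outgoing:
        return outgoing[event]     # original behavior preserved unchanged
    # Incomplete: injected decision point offers stubs to every state.
    return resolve(state, states, rules, ask_user)

rules = [Rule("B", "C")]           # script: from B, jump to C
s = step("A", "go", rules)         # modeled transition -> "B"
s = step(s, "tick", rules)         # decision point, rule fires -> "C"
```

Sticking to modeled transitions reproduces original runs exactly; only unmodeled situations route through the decision point, mirroring the simulation-preorder argument.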

3. Tracing and Replay: PREX for Asynchronous Message-Passing Concurrency

González-Abril and Vidal (González-Abril et al., 2021) generalize the classic tracing and replay protocol for concurrent message-passing programs using PREX. The approach—labeled "prefix-based tracing"—is operationalized as follows:

  • Partial Execution Trace (Prefix): Given a partial trace Π, representing the first k actions in each process of a concurrent system, PREX instruments the program such that execution is forced to exactly replay Π via a central scheduler.
  • Switch to Free Execution: Once Π is fully consumed (i.e., all per-process prefixes are replayed), the program proceeds nondeterministically, operating under standard message-passing semantics. All actions from this point are traced.
  • Instrumented Scheduler: Every primitive (spawn, send, receive) is routed through a scheduler that, in replay mode, enforces the partial prefix and, in trace mode, simply records actions.
  • Correctness Guarantee: The resulting complete trace τ always satisfies Π ⊑ τ, and when Π is a full execution, the run is fully deterministic.

Both pure tracing and full replay are recovered as edge cases (empty or full Π). This mechanism facilitates controlled state-space exploration and debugging beyond pure record-and-replay, allowing systematic examination of execution alternatives that diverge after a prescribed prefix.
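The replay-then-trace scheduler can be sketched as follows. This is a deliberately simplified model, not the Erlang instrumentation of the paper: the "free" phase here picks actions by a fixed round-robin (a stand-in for genuine nondeterminism), and the `(pid, action)` representation is an assumption of the sketch.

```python
# Hedged sketch of prefix-based tracing: force the recorded prefix Π,
# then switch to free execution while recording every action.
from collections import deque

def run(prefix, free_actions):
    """prefix: list of (pid, action) to replay exactly, in order.
    free_actions: per-pid queues of remaining actions; the round-robin
    pick below stands in for a nondeterministic scheduler choice."""
    trace = []
    # Replay mode: the central scheduler enforces exactly Π.
    for pid, act in prefix:
        trace.append((pid, act))
    # Trace mode: free execution; every action is simply recorded.
    queues = {p: deque(q) for p, q in free_actions.items()}
    while any(queues.values()):
        for pid, q in queues.items():
            if q:
                trace.append((pid, q.popleft()))
    return trace

prefix = [(1, "spawn 2"), (1, "send m1"), (2, "receive m1")]
tail = {1: ["send m2"], 2: ["receive m2"]}
tau = run(prefix, tail)
assert tau[:len(prefix)] == prefix   # Π ⊑ τ: the prefix is preserved
```

With an empty `prefix` the run is pure tracing; with `tail` empty it is pure deterministic replay, matching the two edge cases noted above.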

4. PREX in High-Throughput Program Analysis: GPU Kernel Fuzzing

In GPU program fuzzing, naive approaches require executing all B × T threads for every input, leading to prohibitive computational cost when analyzing large CUDA kernels on CPUs (Singh et al., 3 Jan 2026). PREX, as implemented in CuFuzz, leverages the following principles:

  • Affine Access Analysis: Many kernels access memory via affine functions of thread/block indices. PREX statically checks for this pattern using compiler IR analysis.
  • Boundary Thread Theorem: For affine-access kernels, if a memory-safety bug exists, it must manifest on one of four "corner" threads (e.g., (0, 0), (0, B−1), (T−1, 0), (T−1, B−1)).
  • Selective Execution: Only these boundary threads (or, in more general cases, head/tail blocks) are executed per fuzz input. Full execution is used only for kernels that fail the affine-access check.
  • Runtime Scheduling and Coverage: Execution is monitored for coverage and bug discovery using integration with AddressSanitizer and AFL-like coverage tracing. PREX iteratively expands coverage only if necessary.
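The boundary-thread idea can be sketched concretely. Assuming an access of the illustrative form `idx = a*tid + b*bid + c` (affine in the thread and block indices), the index is extremal at the four corners of the (tid, bid) grid, so checking only those corners suffices for out-of-bounds detection under this model; all function names and constants below are hypothetical, not CuFuzz internals.

```python
# Hedged sketch of boundary-thread selection for an affine-access kernel.

def corner_threads(T, B):
    """The four representative (tid, bid) corners for T threads, B blocks."""
    return [(0, 0), (0, B - 1), (T - 1, 0), (T - 1, B - 1)]

def affine_index(tid, bid, a, b, c):
    # Illustrative affine access pattern: idx = a*tid + b*bid + c.
    return a * tid + b * bid + c

def oob_on_corners(T, B, a, b, c, size):
    """True iff some corner thread indexes outside [0, size).
    Since the index is affine (hence monotone in tid and bid), its
    extrema lie at the corners, so these four checks cover all T*B threads."""
    return any(not (0 <= affine_index(t, bl, a, b, c) < size)
               for t, bl in corner_threads(T, B))

# idx = tid + T*bid over a buffer one element too small: the last thread
# of the last block overflows, and only the four corners need to run.
assert oob_on_corners(T=256, B=64, a=1, b=256, c=0, size=256 * 64 - 1)
```

Here 4 executions replace 256 × 64 = 16,384, the same style of pruning that yields the ≥99.9985% reduction reported for the largest kernels.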

Empirical results show PREX yields an average 32× throughput speedup (up to 183×), making previously infeasible campaigns tractable. In the largest cases, only four thread-instances are needed per input, pruning ≥99.9985% of redundant executions. Limitations include fallback to full execution for kernels with complex or data-dependent access patterns.

5. Correctness, Guarantees, and Theoretical Underpinnings

Each PREX instantiation relies on structural properties to justify soundness or preservation.

  • Model Execution: The refinement ensures that all reachable behaviors are preserved (via a simulation preorder L_o ⪯ L_r) and that only the minimum necessary rescue transitions are introduced to address partiality (Bagherzadeh et al., 2021). User or script picks at each decision point can recover any original trace; sticking to original transitions yields behaviorally equivalent runs.
  • Prefix-Based Concurrency Replay: The correctness property ensures that the replay prefix Π is always embedded as a prefix in every executed trace.
  • GPU Fuzzing: Affine analysis and the boundary-thread theorem guarantee that any memory-safety bug triggered by some thread of an affine-access kernel is also triggered by a boundary thread; consequently, if no boundary thread detects a bug, no thread of that kernel can.
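The prefix-embedding property Π ⊑ τ can be made precise with a small check. Since the prefix is recorded per process, a natural reading (assumed here, with an illustrative `(pid, action)` encoding) is that each process's portion of Π must be a prefix of that process's projection of τ:

```python
# Hedged sketch of the prefix relation Π ⊑ τ for per-process prefixes.

def projection(trace, pid):
    """The subsequence of `trace` performed by process `pid`."""
    return [a for p, a in trace if p == pid]

def prefix_embedded(Pi, tau):
    """Π ⊑ τ: for every process, Π's actions are a prefix of its τ-projection."""
    pids = {p for p, _ in Pi} | {p for p, _ in tau}
    return all(
        projection(tau, p)[:len(projection(Pi, p))] == projection(Pi, p)
        for p in pids
    )

Pi  = [(1, "send m1"), (2, "receive m1")]
tau = [(1, "send m1"), (2, "receive m1"), (2, "send m2"), (1, "receive m2")]
assert prefix_embedded(Pi, tau)
assert not prefix_embedded([(1, "send m2")], tau)
```

When Π covers every action of every process, any τ satisfying Π ⊑ τ coincides with the recorded run, recovering full deterministic replay.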

In each domain, PREX leverages domain-specific structure (state-machine topology, message-passing schedules, memory-access patterns) to justify aggressive execution pruning or dynamic gap-filling while retaining coverage or soundness guarantees.

6. Practical Implementation, Performance, and Limitations

Representative implementations of PREX techniques span external tools, compiler passes, and runtime libraries, exemplified by PMExec (Bagherzadeh et al., 2021), prefix-based Erlang instrumentation (González-Abril et al., 2021), and CuFuzz (Singh et al., 3 Jan 2026).

Performance and complexity analyses report that static analysis and automatic transformation overheads are moderate (sublinear in model size or proportional to the size of the partial prefix), and runtime decision-point resolution can be highly efficient (per-decision selection under 1 ms for 10^4 rules). In the GPU context, PREX achieves a geometric-mean acceleration of 32× across benchmarks, but falls back to baseline in the presence of non-affine or highly data-dependent accesses.
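For readers unfamiliar with the aggregation, a geometric mean of per-benchmark speedups (the kind of figure behind the reported 32×) is computed as below; the sample numbers are illustrative, not the paper's data:

```python
import math

# Geometric mean of multiplicative speedups: average in log space so a
# single outlier benchmark does not dominate the summary figure.
speedups = [8.0, 32.0, 128.0]          # hypothetical per-kernel speedups
geo_mean = math.exp(sum(math.log(s) for s in speedups) / len(speedups))
# 8 * 32 * 128 = 2**15, whose cube root is 2**5 = 32.0
```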

A table summarizing PREX settings is given below:

| Domain         | PREX Mechanism                   | Principal Guarantee                  |
|----------------|----------------------------------|--------------------------------------|
| State machines | Model refinement, decision points | Simulation preorder, deadlock-freedom |
| Concurrency    | Prefix replay and tracing        | Prefix preservation (⊑)              |
| GPU fuzzing    | Affine thread pruning            | Bug surfacing via representatives    |

Key limitations include reliance on recognizable structure (e.g., affine accesses, explicit state-charts), inability to optimize obscured or highly dynamic dependencies, and potential bottlenecks from central coordination (as in the Erlang scheduler). Future work is suggested in broadening the class of representable behaviors—for example, via dynamic clustering, richer static analysis, or ML-guided representative selection (Singh et al., 3 Jan 2026).

7. Synthesis and Research Trajectory

Across application domains, PREX provides a unified abstraction for addressing the twin challenges of partiality (incompleteness, early-stage design) and scalability (reduction of redundant or non-contributory computation). The adoption in model-driven engineering, concurrent program debugging, and high-throughput fuzz testing illustrates the versatility of the paradigm.

Current research directions include formalization of step semantics and liveness/safety properties for prefix-based replay (González-Abril et al., 2021), empirical scaling with larger or more dynamic models (Bagherzadeh et al., 2021), and generalization of representative selection to handle irregular control or data flows in GPU kernels (Singh et al., 3 Jan 2026). The orchestration of static analysis, runtime adaptation, and correctness preservation constitutes the defining methodological signature of PREX across these efforts.
