sGraph Verification in Tensor Programs
- sGraph verification is a formal method that symbolically encodes tensor programs to ensure all concrete instantiations remain functionally equivalent.
- It applies symbolic expression analysis, algebraic rewriting, and e-graph equivalence for validating complex GPU kernel and parallel operation hierarchies.
- By enforcing algebraic and hardware constraints, sGraph verification enables superoptimization, resulting in notable speedups for tensor computations.
sGraph Verification
An sGraph is a symbolic, hierarchical representation of a family of tensor programs, used to compactly encode and reason about program equivalences and optimizations by symbolically representing certain execution parameters, program structure, and parallelization strategies. sGraph verification refers to the formal process of proving that all concrete instantiations of a given sGraph are functionally equivalent to a target tensor computation under a specified set of algebraic identities and hardware constraints. Conceptualized in the Prism/SIGMA superoptimization framework, sGraph verification leverages symbolic expression analysis, algebraic rewriting, and e-graph–based equivalence checking to guarantee the semantic correctness of tensor program transformations (Wu et al., 16 Apr 2026).
1. sGraph Definition and Structure
An sGraph is defined as a tuple of five components:
- Nodes: a set of hierarchical nodes, each representing a kernel-level, block-graph, or thread-graph operator.
- Edges: a directed acyclic edge set capturing data-flow dependencies.
- Parallelization parameters: symbolic grid and loop dimensions.
- Mapping variables: Boolean variables, each specifying whether a given data dimension of a given tensor is partitioned along a given parallel dimension.
- expand: a mapping that assigns each hierarchical operator node to a finer-grained sGraph, preserving the GPU kernel/block/thread hierarchy.
Partitioning constraints enforce that, for each tensor, each parallel dimension partitions at most one data dimension and each data dimension is partitioned at most once with respect to the grid dimensions. An sGraph thus symbolically captures entire classes of program implementations sharing common structure and parallelization schemes (Wu et al., 16 Apr 2026).
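A minimal Python sketch of the five components and the partitioning constraint may help make the structure concrete. The class name `SGraph`, its field names (other than `expand`), and the helper `partitioning_ok` are hypothetical and chosen for illustration only; they are not the data structures of the cited work.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SGraph:
    """Illustrative container for the five sGraph components (hypothetical names)."""
    nodes: List[str]                      # hierarchical operator nodes
    edges: List[Tuple[str, str]]          # data-flow edges (src, dst), assumed acyclic
    par_dims: List[str]                   # symbolic grid and loop dimensions
    # mapping[(tensor, data_dim, par_dim)] is True when data_dim of tensor
    # is partitioned along par_dim
    mapping: Dict[Tuple[str, int, str], bool] = field(default_factory=dict)
    # expand[node] is the finer-grained sGraph of a hierarchical operator
    expand: Dict[str, "SGraph"] = field(default_factory=dict)


def partitioning_ok(g: SGraph) -> bool:
    """Check that, per tensor, each parallel dimension partitions at most one
    data dimension and each data dimension is partitioned at most once."""
    used_par = set()   # (tensor, par_dim) pairs already in use
    used_data = set()  # (tensor, data_dim) pairs already partitioned
    for (tensor, data_dim, par_dim), enabled in g.mapping.items():
        if not enabled:
            continue
        if (tensor, par_dim) in used_par or (tensor, data_dim) in used_data:
            return False
        used_par.add((tensor, par_dim))
        used_data.add((tensor, data_dim))
    return True


good = SGraph(["matmul"], [], ["gx", "gy"],
              {("A", 0, "gx"): True, ("A", 1, "gy"): True})
bad = SGraph(["matmul"], [], ["gx", "gy"],
             {("A", 0, "gx"): True, ("A", 1, "gx"): True})
```

Here `good` splits the two data dimensions of `A` along different grid dimensions, while `bad` reuses `gx` for both and is rejected by `partitioning_ok`.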
2. Symbolic Semantics and Mapping Variables
sGraphs define families of tensor programs that depend symbolically on the mapping variables and the parallel dimension sizes. For each data dimension of a tensor, the “local” size in a given subgraph is a symbolic expression in that dimension's global extent, the mapping variables, and the parallel dimension sizes. These symbolic shape transformations propagate mechanically through the hierarchical structure via the expand field.
Algebraic constraints over these symbolic sizes are collected to ensure shape compatibility for operators like matmul, enforcing valid parallel decompositions.
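As a rough illustration of how local sizes and shape constraints interact, the sketch below assumes that a dimension's local size is its global extent divided by the size of every parallel dimension that partitions it. That formula is an assumption made for this example, and the function names `local_size` and `matmul_compatible` are hypothetical.

```python
def local_size(global_extent, tensor, data_dim, mapping, par_sizes):
    """Local extent of one data dimension (assumed: global extent divided by
    the size of each parallel dimension that partitions it)."""
    size = global_extent
    for (t, d, p), enabled in mapping.items():
        if enabled and t == tensor and d == data_dim:
            if size % par_sizes[p] != 0:
                raise ValueError(f"{p} does not evenly divide dimension {d} of {tensor}")
            size //= par_sizes[p]
    return size


def matmul_compatible(a_shape, b_shape):
    """Shape constraint for matmul: the contracted dimensions must match."""
    return a_shape[1] == b_shape[0]


# A is 64x128 and B is 128x32; grid dimension gx splits the rows of A,
# loop dimension k splits the contraction dimension of both operands.
par_sizes = {"gx": 4, "k": 8}
mapping = {("A", 0, "gx"): True, ("A", 1, "k"): True, ("B", 0, "k"): True}

a_local = (local_size(64, "A", 0, mapping, par_sizes),
           local_size(128, "A", 1, mapping, par_sizes))
b_local = (local_size(128, "B", 0, mapping, par_sizes),
           local_size(32, "B", 1, mapping, par_sizes))

# Dropping the split on B breaks the constraint on the contracted dimension.
bad_mapping = {("A", 0, "gx"): True, ("A", 1, "k"): True}
b_bad = (local_size(128, "B", 0, bad_mapping, par_sizes),
         local_size(32, "B", 1, bad_mapping, par_sizes))
```

With these values `matmul_compatible(a_local, b_local)` holds, while `matmul_compatible(a_local, b_bad)` does not, which is the kind of mismatch the collected constraints are meant to rule out.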
3. Verification via E-Graph Equivalence
For a fixed assignment of the mapping variables (a concrete choice of partitioning), each sGraph induces a deterministic tensor expression. The verification problem is to check whether this expression is equivalent to the reference (e.g., user-specified) computation.
sGraph verification is performed using equivalence graphs (e-graphs), such as those implemented in the egg library. The expression language is augmented with special operators for partitioning, reduction, combination, and replication, alongside key tensor operations (matmul, add, etc.). A rich set of rewrite rules formalizes algebraic identities, including:
- Associativity, distributivity of matmul
- Commutation and cancellation properties of parallel operators
- Parallel matmul reductions
If the e-graph unifies the candidate expression and the reference expression into one equivalence class after saturation, the candidate mapping is deemed correct (Wu et al., 16 Apr 2026).
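The idea of rewriting a candidate expression until it meets the reference can be sketched with a naive breadth-first search over rewritten terms. This is a simplified stand-in for equality saturation, not an e-graph and not the egg API; the operator names (`partition`, `partition_k`, `reduce`, `combine`) and both rules are hypothetical examples of the rule families listed above.

```python
from collections import deque


def rule_combine_partition(t):
    # combine(partition(x)) -> x
    if (isinstance(t, tuple) and t[0] == "combine"
            and isinstance(t[1], tuple) and t[1][0] == "partition"):
        return [t[1][1]]
    return []


def rule_parallel_matmul(t):
    # reduce(matmul(partition_k(a), partition_k(b))) -> matmul(a, b)
    if (isinstance(t, tuple) and t[0] == "reduce"
            and isinstance(t[1], tuple) and t[1][0] == "matmul"):
        _, lhs, rhs = t[1]
        if (isinstance(lhs, tuple) and lhs[0] == "partition_k"
                and isinstance(rhs, tuple) and rhs[0] == "partition_k"):
            return [("matmul", lhs[1], rhs[1])]
    return []


RULES = [rule_combine_partition, rule_parallel_matmul]


def one_step(t):
    """All terms reachable by applying one rule at any position in t."""
    out = []
    for rule in RULES:
        out.extend(rule(t))
    if isinstance(t, tuple):
        for i in range(1, len(t)):
            for sub in one_step(t[i]):
                out.append(t[:i] + (sub,) + t[i + 1:])
    return out


def equivalent(candidate, reference, max_terms=1000):
    """Bounded search: True if the candidate rewrites to the reference."""
    seen = {candidate}
    queue = deque([candidate])
    while queue and len(seen) <= max_terms:
        term = queue.popleft()
        if term == reference:
            return True
        for nxt in one_step(term):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


candidate = ("combine", ("partition",
             ("reduce", ("matmul", ("partition_k", "A"), ("partition_k", "B")))))
reference = ("matmul", "A", "B")
```

A real e-graph stores all equivalent forms compactly and applies rules in both directions; the search above only follows rewrites one way and is bounded by `max_terms`, so a `False` result here means "not found", not "not equivalent".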
4. Pruning: Algebraic and Hardware Constraints
sGraph verification is coupled with structured pruning:
- Symbolic dimension matching: Linear equalities over mapping variables are enforced to ensure that all partitionings and reductions are shape compatible.
- Expression-guided pruning: In the degenerate setting where the sGraph reduces to a scalar-dataflow DAG, every intermediate computation must appear structurally in the root expression; candidates that fail this check are eliminated cheaply.
- Hardware constraints: Resource limitations (e.g., shared memory) further restrict the allowed values of the symbolic parameters, ensuring generated sGraphs are hardware-feasible.
This logic allows the superoptimizer to consider only structurally valid and concrete-equivalent program representations.
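The hardware-constraint step can be pictured as a filter over candidate parallel dimension sizes. The sketch below is an assumption-laden illustration: the 48 KiB shared-memory budget, the 2-byte element size, the tile-footprint model, and the names `tile_bytes` and `feasible_grids` are all hypothetical and not taken from the cited work.

```python
from itertools import product

SHARED_MEM_BYTES = 48 * 1024  # hypothetical per-block shared-memory budget
BYTES_PER_ELEM = 2            # assumes 16-bit tensor elements


def tile_bytes(tm, tn, tk):
    """Footprint of one A tile (tm x tk) plus one B tile (tk x tn)."""
    return (tm * tk + tk * tn) * BYTES_PER_ELEM


def feasible_grids(M, N, K, candidates):
    """Keep the (gm, gn, gk) splits that divide the problem evenly and whose
    per-block tiles fit in the assumed shared-memory budget."""
    kept = []
    for gm, gn, gk in product(candidates, repeat=3):
        if M % gm or N % gn or K % gk:
            continue  # dimension matching: the split must be exact
        tm, tn, tk = M // gm, N // gn, K // gk
        if tile_bytes(tm, tn, tk) <= SHARED_MEM_BYTES:
            kept.append((gm, gn, gk))
    return kept


grids = feasible_grids(1024, 1024, 1024, [1, 8, 16])
```

For a 1024x1024x1024 matmul, the unsplit configuration `(1, 1, 1)` and the split `(8, 8, 8)` exceed the assumed budget, while `(16, 16, 16)` and `(8, 8, 16)` survive the filter.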
5. Correctness Criteria and Soundness
Correct mapping: An assignment of the mapping variables is correct if, for all choices of parallel dimension sizes, the corresponding sGraph program yields the same functional result as the reference computation.
Feasible sGraph: An sGraph is feasible if some correct mapping exists.
Soundness of the verification procedure is conditional on the soundness of the employed algebraic axioms: if the e-graph deems the sGraph correct, all corresponding concrete programs are semantically equivalent for all allowed dimension assignments. Completeness (catching all possible correct mappings) is not guaranteed (Wu et al., 16 Apr 2026).
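The "for all parallel dimension choices" requirement can be illustrated numerically for one mapping, a matmul whose contraction dimension is split into `p` chunks and then sum-reduced. The sketch below only tests a finite sample of `p` values on integer matrices; it is a sanity check under that assumption, not a substitute for the symbolic proof. The function names are hypothetical.

```python
def matmul(a, b):
    """Reference matmul on nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def split_k_matmul(a, b, p):
    """Partition the contraction dimension into p chunks, multiply each
    chunk pair, and sum-reduce the partial results."""
    K = len(b)
    if K % p != 0:
        raise ValueError("p must divide the contraction dimension")
    chunk = K // p
    rows, cols = len(a), len(b[0])
    total = [[0] * cols for _ in range(rows)]
    for c in range(p):
        lo, hi = c * chunk, (c + 1) * chunk
        partial = matmul([row[lo:hi] for row in a], b[lo:hi])
        for i in range(rows):
            for j in range(cols):
                total[i][j] += partial[i][j]
    return total


def agrees_for_all(a, b, par_choices):
    """True if the split-K program matches the reference for every sampled p."""
    ref = matmul(a, b)
    return all(split_k_matmul(a, b, p) == ref for p in par_choices)


A = [[1, 2, 3, 4], [5, 6, 7, 8]]
B = [[1, 0, 2], [0, 1, 3], [4, 5, 6], [7, 8, 9]]
```

Integer inputs keep the comparison exact; with floating-point data the partial sums would be reordered and a tolerance would be needed.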
6. Example: Fused Softmax-MatMul Verification
Prism demonstrates verification with a fused softmax–matmul block. Under suitable mapping choices, the candidate sGraph can be algebraically reduced to the original computation through successive e-graph rewrites (pushing partitions into subexpressions, rewriting reductions, commuting combination operators). Once the rewritten sGraph expression matches the reference expression, the equivalence is verified.
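One way a partitioned softmax followed by matmul can agree with the unpartitioned computation is sketched below. The decomposition used here (summing partial numerators and partial normalizers across chunks, then dividing once) is an assumption chosen for illustration and is not claimed to be the exact sGraph from the cited example; the usual max-subtraction for numerical stability is omitted for brevity.

```python
import math


def softmax_matmul(x, w):
    """Reference: softmax over the K scores in x, then a product with the K x N matrix w."""
    e = [math.exp(v) for v in x]
    z = sum(e)
    return [sum(e[k] * w[k][j] for k in range(len(x))) / z
            for j in range(len(w[0]))]


def softmax_matmul_partitioned(x, w, p):
    """Split K into p chunks; each chunk contributes a partial numerator and a
    partial normalizer, which are sum-reduced before the final division."""
    K, N = len(x), len(w[0])
    if K % p != 0:
        raise ValueError("p must divide the number of scores")
    chunk = K // p
    num = [0.0] * N
    den = 0.0
    for c in range(p):
        lo = c * chunk
        e = [math.exp(v) for v in x[lo:lo + chunk]]
        den += sum(e)
        for j in range(N):
            num[j] += sum(e[k] * w[lo + k][j] for k in range(chunk))
    return [n / den for n in num]


x = [0.1, -0.5, 1.2, 0.3]
w = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 4.0]]
ref = softmax_matmul(x, w)
out = softmax_matmul_partitioned(x, w, 2)
```

The two results agree up to floating-point rounding, which mirrors the algebraic argument: the division by the normalizer commutes with the sum-reduction over chunks.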
7. Practical Impact and Scope
Evaluation on large LLM workloads shows that sGraph-based superoptimization achieves speedups over state-of-the-art systems together with shorter optimization times. sGraph verification thus enables exhaustive yet scalable exploration of high-performance tensor kernel implementations in real-world systems, bridging formal functional verification and practical superoptimization for production workloads (Wu et al., 16 Apr 2026).