sGraph: Hierarchical Symbolic & Sparse Graphs

Updated 3 July 2026

sGraph is a formalism that defines hierarchical, symbolic, and sparse graph representations used for optimizing tensor programs, sketch analysis, and solar image compression.
It employs structured DAGs and symbolic parameterization to enable early pruning, efficient autotuning, and correctness verification in complex computations.
Its modular applications in multi-level sketch learning and spectral-spatial compression demonstrate measurable improvements in speed, accuracy, and resource efficiency.

The term "sGraph" appears in several distinct research contexts in recent literature, most notably as a symbolic, hierarchical graph representation for tensor program optimization (Wu et al., 16 Apr 2026), as a stroke-level sparse graph module for multi-level sketch learning (Cheng et al., 14 Oct 2025), and as a spectral-spatial graph learning framework for solar image compression (Siwakoti et al., 30 Dec 2025). Each context defines "sGraph" for a specialized purpose, and each contributes algorithmic advances built on sparse, structured, or symbolic graph representations.

1. Symbolic sGraph for Tensor Program Superoptimization

An sGraph, as introduced in "Prism: Symbolic Superoptimization of Tensor Programs," is a hierarchical directed acyclic graph (DAG) parameterized symbolically to encode entire families of tensor program implementations compactly (Wu et al., 16 Apr 2026). This abstraction enables tractable formal reasoning about correctness, parallelism, and dataflow for GPU and other parallel hardware, and unifies the program search space through symbolic variables for grid sizes, loop dimensions, and tensor mappings. An sGraph generalizes the μGraph of Mirage by retaining all parallelization parameters (grid and for-loop dimensions), as well as tensor-dimension-to-parallel-dimension mappings, in symbolic form.

Formally, an sGraph $S$ consists of:

$\mathbb{K}$ : kernel-graph (DAG of high-level tensor ops)
$\mathbb{B}$ : block-graphs attached to kernel nodes, may nest thread-graphs
$\mathcal{P} = \mathcal{P}_g \cup \mathcal{P}_f$ : set of parallelization (grid and loop) dimensions, symbolic
$\mathbb{D}$ : symbolic variables $d_p$ for sizes of each $p \in \mathcal P$
$\mathbb{M}$ : mappings $m_{T,d,p} \in \{0,1\}$, binary variables indicating whether tensor dimension $(T,d)$ is partitioned by parallel dimension $p$
Edges : directed dataflow edges, shape-annotated
Constraints : correctness/equivalence constraints

The sGraph representation defines a (typically exponentially large) family of concrete programs, each obtained by assigning values to the symbolic variables $\mathbb{D}$ and $\mathbb{M}$ subject to shape-compatibility, hardware, and correctness constraints. This forms the basis for a two-level search: symbolic pruning is performed over sGraphs, while concrete instantiations undergo autotuning and cost profiling.
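To make the idea of instantiating symbolic variables concrete, the following is a minimal sketch in Python. It is not Prism's data structure: `ToySGraph`, `instantiations`, the example extents, and the two constraints (divisibility and "at most one parallel dimension per tensor dimension") are all assumptions chosen for illustration. Setting `mapping[p] = (T, d)` plays the role of $m_{T,d,p} = 1$.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class ToySGraph:
    """Toy stand-in for the symbolic part of an sGraph (hypothetical names)."""
    tensor_dims: dict     # (tensor, dim) -> extent, e.g. ("A", 0) -> 64
    parallel_dims: tuple  # names of the parallelization dimensions p
    size_choices: dict    # p -> candidate values for the symbolic size d_p


def instantiations(g):
    """Yield (sizes, mapping) assignments that satisfy two toy constraints."""
    # Each parallel dimension partitions one tensor dimension, or none (None).
    targets = [None] + list(g.tensor_dims)
    for sizes in product(*(g.size_choices[p] for p in g.parallel_dims)):
        d = dict(zip(g.parallel_dims, sizes))
        for choice in product(targets, repeat=len(g.parallel_dims)):
            mapping = dict(zip(g.parallel_dims, choice))
            used = [t for t in choice if t is not None]
            if len(used) != len(set(used)):
                continue  # toy rule: a tensor dimension is split by at most one p
            if any(g.tensor_dims[t] % d[p]
                   for p, t in mapping.items() if t is not None):
                continue  # shape compatibility: extent must be divisible by d_p
            yield d, mapping


graph = ToySGraph(
    tensor_dims={("A", 0): 64, ("A", 1): 48},
    parallel_dims=("grid_x", "loop_k"),
    size_choices={"grid_x": (2, 4, 8), "loop_k": (3, 16)},
)
candidates = list(instantiations(graph))
print(len(candidates), "feasible assignments")
```

Even in this tiny example the number of assignments grows multiplicatively with each added parallel dimension, which is why discarding candidates before enumeration matters.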

Characteristics of sGraph formalism include:

Hierarchical kernel-block-thread composition
Full symbolic parameterization: all parallel and mapping parameters are left explicit as variables, not constants
Early symbolic pruning of infeasible design routes (by constraint satisfaction and shape-matching)
Correctness ensured by e-graph (egg) rewriting over a rich set of tensor-algebra and parallelization identities
Final selection via autotuning, after instantiation to concrete hardware parameters (grid/block size), yielding optimized low-level code
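The two-level flow in the list above (prune symbolically, then autotune concrete instances) can be sketched as follows. This is a toy under stated assumptions, not Prism's algorithm: the candidate names, the shared-memory formulas, the `SMEM_LIMIT` budget, and the analytic cost functions are all hypothetical, and the cost functions merely stand in for on-device profiling. E-graph equivalence checking is omitted entirely.

```python
SMEM_LIMIT = 48 * 1024  # bytes; hypothetical per-block shared-memory budget
TILE_SIZES = (16, 32, 64, 128)  # candidate values for the symbolic tile size t

# Each symbolic candidate: (shared-memory bytes as a function of t,
#                           toy cost as a function of t).
# Every memory formula below is non-decreasing in t.
CANDIDATES = {
    "single_buffer": (lambda t: 2 * t * t * 4, lambda t: 1000.0 / t + 0.5 * t),
    "double_buffer": (lambda t: 4 * t * t * 4, lambda t: 800.0 / t + 0.6 * t),
    "full_staging": (lambda t: 64 * 1024 + t, lambda t: 100.0 / t),
}


def symbolic_prune(candidates):
    """Drop whole families without enumerating them.

    Because memory use is non-decreasing in t, a family whose smallest
    tile already exceeds the budget has no feasible instance at all.
    """
    t_min = min(TILE_SIZES)
    return {name: c for name, c in candidates.items()
            if c[0](t_min) <= SMEM_LIMIT}


def autotune(candidates):
    """Instantiate surviving families; return the cheapest (cost, name, t)."""
    best = None
    for name, (smem, cost) in candidates.items():
        for t in TILE_SIZES:
            if smem(t) > SMEM_LIMIT:
                continue  # this concrete instance does not fit
            trial = (cost(t), name, t)
            if best is None or trial < best:
                best = trial
    return best


survivors = symbolic_prune(CANDIDATES)
best_cost, best_name, best_tile = autotune(survivors)
print(sorted(survivors), best_name, best_tile)
```

The point of the split is that `symbolic_prune` reasons once per family, while `autotune` pays a cost per concrete instance, so every family removed early saves all of its instantiations.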

This approach delivers both performance (kernel speedups and reduced search cost relative to baselines) and formal equivalence guarantees in modern ML accelerator settings (Wu et al., 16 Apr 2026).

2. sGraph Module in Multi-Level Sketch Representation Learning

In the context of SDGraph for sketch analysis, sGraph specifically denotes the sparse graph module operating at the stroke level, as detailed in "SDGraph: Multi-Level Sketch Representation Learning by Sparse-Dense Graph Architecture" (Cheng et al., 14 Oct 2025). The sGraph module is designed to extract and encode information at coarser semantic granularity than point-level graphs, operating with the following key properties:

Nodes: one per stroke in an input sketch; typically, each stroke is a sequence of 2D points.
Node Features: obtained by applying a shared 1D convolutional neural network and max-pooling over the point sequence of each stroke, resulting in a fixed-length embedding for each stroke.
Edges: constructed based on $k$-nearest-neighbor relationships between stroke centroids (in 2D space or feature space). Adjacency is symmetric.
Graph Convolution: employs EdgeConv (as in DGCNN), with per-edge MLPs processing the pairwise difference between embeddings, followed by max-aggregation over neighbors.
Pooling (S-Down): uses farthest-point sampling among stroke centroids to select representative nodes, with neighbor aggregation via MLPs for coarsening.
Unpooling (S-Up): interpolates from the coarsened set back to the original node set via distance-weighted combinations from the nearest pool centers, followed by MLP refinements and optional skip connections.
Integration: sGraph modules are stacked as encoder blocks, acting in concert with dense (point-level) graphs, and serve as decoders in generation tasks.
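The edge construction and graph convolution steps listed above can be sketched in NumPy. This is an illustrative approximation rather than the SDGraph implementation: the function names are hypothetical, the stroke embeddings are random placeholders for the 1D-CNN output, and the per-edge MLP is reduced to a single linear layer with ReLU. The edge input follows the DGCNN form, concatenating $x_i$ with $x_j - x_i$.

```python
import numpy as np


def knn_adjacency(centroids, k):
    """Symmetric k-NN adjacency (boolean) from centroids of shape (n, 2)."""
    diff = centroids[:, None, :] - centroids[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)             # exclude self-matches
    nearest = np.argsort(dist, axis=1)[:, :k]  # k closest strokes per node
    adj = np.zeros(dist.shape, dtype=bool)
    rows = np.repeat(np.arange(len(centroids)), k)
    adj[rows, nearest.ravel()] = True
    return adj | adj.T                         # symmetrize


def edge_conv(features, adj, weight, bias):
    """One EdgeConv step: max over neighbors j of ReLU([x_i, x_j - x_i] W + b)."""
    out = np.zeros((features.shape[0], weight.shape[1]))
    for i in range(features.shape[0]):
        nbrs = np.flatnonzero(adj[i])
        xi = np.repeat(features[i][None, :], len(nbrs), axis=0)
        edge_in = np.concatenate([xi, features[nbrs] - xi], axis=1)
        msgs = np.maximum(edge_in @ weight + bias, 0.0)  # single-layer "MLP"
        out[i] = msgs.max(axis=0)                        # max-aggregation
    return out


rng = np.random.default_rng(0)
# Six strokes with different numbers of 2D points.
strokes = [rng.random((n, 2)) for n in (5, 9, 4, 7, 6, 8)]
centroids = np.stack([s.mean(axis=0) for s in strokes])
embeddings = rng.random((len(strokes), 8))  # placeholder for learned features
adj = knn_adjacency(centroids, k=2)
W = rng.standard_normal((16, 4))            # input is 2 * 8 after concatenation
bias = np.zeros(4)
out = edge_conv(embeddings, adj, W, bias)
print(out.shape)  # (6, 4)
```

Because the aggregation is a maximum over the neighbor set, the result does not depend on the order in which strokes are listed.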

The sGraph module is not separately supervised but participates in end-to-end training losses for sketch classification (cross-entropy), retrieval (triplet loss), and generation (diffusion MSE), providing both stroke-level and sketch-level feature learning. Its architectural efficiency for multi-granular sketch processing underpins the observed improvement in state-of-the-art performance on sketch recognition and generation (Cheng et al., 14 Oct 2025).
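The pooling (S-Down) and unpooling (S-Up) steps of the module can be illustrated with a short NumPy sketch. It covers only the geometric parts, namely farthest-point sampling over stroke centroids and inverse-distance interpolation back to all nodes; the MLP aggregation, MLP refinement, and skip connections are left out. The function names, the choice of `k=3` nearest centers, and the example coordinates are assumptions.

```python
import numpy as np


def farthest_point_sampling(points, m, start=0):
    """Greedy FPS: repeatedly pick the point farthest from those chosen so far."""
    chosen = [start]
    min_dist = np.linalg.norm(points - points[start], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        to_new = np.linalg.norm(points - points[nxt], axis=1)
        min_dist = np.minimum(min_dist, to_new)
    return np.array(chosen)


def interpolate_up(points, centers, center_feats, k=3, eps=1e-8):
    """Inverse-distance-weighted interpolation from pooled centers to all nodes."""
    dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
    k = min(k, len(centers))
    idx = np.argsort(dist, axis=1)[:, :k]        # k nearest centers per node
    d = np.take_along_axis(dist, idx, axis=1)
    w = 1.0 / (d + eps)
    w /= w.sum(axis=1, keepdims=True)            # weights sum to one per node
    return (w[:, :, None] * center_feats[idx]).sum(axis=1)


centroids = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0],
                      [0.0, 10.0], [10.0, 10.0]])
keep = farthest_point_sampling(centroids, m=3)
pooled_feats = np.random.default_rng(0).random((3, 4))  # placeholder features
restored = interpolate_up(centroids, centroids[keep], pooled_feats)
print(keep, restored.shape)
```

A node that coincides with a pooled center receives essentially that center's feature, while other nodes receive a blend weighted toward their closest centers.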

3. sGraph in Multispectral Solar Image Compression

"sGraph" also refers to the hybrid framework for learning to compress multispectral solar imagery in "Spectral and Spatial Graph Learning for Multispectral Solar Image Compression" (Siwakoti et al., 30 Dec 2025). Here, the pipeline incorporates explicit spectral and spatial graph modules as key architectural primitives:

Inter-Spectral Windowed Graph Embedding (iSWGE):
- Models inter-band relationships by treating each spectral group (set of channels) within a spatial window as a node; edges are constructed in a static cyclic topology.
- Node and edge embeddings are learned by MLPs, and message-passing is conducted over three layers using Laplacian-normalized adjacency and CensNet-style co-embedding updates.
- This models the spectral dependencies necessary to maintain high-fidelity reconstruction across EUV channels.
Windowed Spatial Graph Attention + Convolutional Block Attention (WSGA-C):
- Applies sparse, dynamic $k$-NN graph attention within spatial windows, aggregating features from each graph neighborhood with learned attention weights.
- A CBAM (Convolutional Block Attention Module) branch further supplies channel and spatial gating.
- The combined features are concatenated and projected, focusing the network on salient spatial structures.
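The static cyclic topology and Laplacian-normalized propagation used in the iSWGE description above can be sketched as follows. This is a simplified illustration: it shows a single node-feature propagation step with the common $D^{-1/2}(A+I)D^{-1/2}$ normalization, and it does not reproduce the CensNet-style node and edge co-embedding. The function names, the number of spectral groups, and the feature sizes are assumptions.

```python
import numpy as np


def cyclic_adjacency(n):
    """Static ring topology: node i is linked to i-1 and i+1 (mod n)."""
    adj = np.zeros((n, n))
    idx = np.arange(n)
    adj[idx, (idx + 1) % n] = 1.0
    adj[(idx + 1) % n, idx] = 1.0
    return adj


def normalize(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    a_hat = adj + np.eye(len(adj))
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def propagate(node_feats, adj_norm, weight):
    """One message-passing layer: ReLU(A_norm X W)."""
    return np.maximum(adj_norm @ node_feats @ weight, 0.0)


rng = np.random.default_rng(0)
num_groups = 4                       # spectral groups in one spatial window
a_norm = normalize(cyclic_adjacency(num_groups))
x = rng.random((num_groups, 8))      # placeholder node embeddings
w = rng.standard_normal((8, 8))
h = x
for _ in range(3):                   # three layers, as in the description
    h = propagate(h, a_norm, w)
print(h.shape)
```

With a ring and self-loops every node has degree three, so each layer averages a spectral group with its two cyclic neighbors before the learned projection.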

The sGraph pipeline achieves superior compression metrics (up to 1.09% PSNR and 1.62 dB MS-SSIM gain, 20.15% reduction in Mean Spectral Information Divergence) compared to CNN and conventional graph-attention baselines, with minimal runtime overhead (Siwakoti et al., 30 Dec 2025).
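For readers unfamiliar with the spectral metric, the sketch below computes a mean spectral information divergence under the assumption that it follows the standard definition: a symmetrized Kullback-Leibler divergence between the normalized per-pixel spectra of the original and reconstructed images, averaged over pixels. The exact formulation used in the cited work may differ, and `mean_sid`, the `eps` smoothing, and the array shapes are choices made here for illustration.

```python
import numpy as np


def mean_sid(original, reconstructed, eps=1e-12):
    """Mean symmetrized KL divergence between per-pixel spectra.

    Both inputs have shape (bands, height, width) and non-negative values.
    """
    p = original.reshape(original.shape[0], -1) + eps
    q = reconstructed.reshape(reconstructed.shape[0], -1) + eps
    p = p / p.sum(axis=0, keepdims=True)  # normalize each pixel's spectrum
    q = q / q.sum(axis=0, keepdims=True)
    sid = (p * np.log(p / q)).sum(axis=0) + (q * np.log(q / p)).sum(axis=0)
    return float(sid.mean())


rng = np.random.default_rng(0)
image = rng.random((6, 16, 16))                    # six bands, 16x16 pixels
noisy = np.clip(image + 0.05 * rng.standard_normal(image.shape), 0.0, None)
print(mean_sid(image, image), mean_sid(image, noisy))
```

Lower values indicate that the relative distribution of energy across bands is better preserved, independent of overall brightness.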

4. Algorithmic and Mathematical Foundations

Regardless of context, sGraph formalizations share a focus on exploiting structural sparsity, parameterized representations, and efficient message-passing or reasoning algorithms:

In symbolic tensor optimization, all grid, loop, and data-mapping parameters are represented as symbolic variables subject to explicit constraints, allowing compact representation and algebraic manipulation over large program classes.
In sketch learning, the stroke-level sGraph module is built with explicit neighbor selection and permutation-invariance, ensuring robustness to variable graph sizes and topologies.
Compression frameworks model spatial and spectral relationships explicitly using windowed graph operations, combining MLP-based aggregation with attention and co-embedding message passing.

A summary table highlights the different sGraph instantiations:

Context	Node Definition	Edge Construction	Key Application
Tensor program superoptimization (Wu et al., 16 Apr 2026)	Kernel/block/thread ops (symbolic)	Data-flow, parameterized	Correctness-guaranteed GPU kernels
Multi-level sketch learning (Cheng et al., 14 Oct 2025)	Strokes in a sketch	k-NN over stroke centroids	Stroke-level embedding, recognition
Solar image compression (Siwakoti et al., 30 Dec 2025)	Spectral groups per spatial window	Static cyclic topology	MSID-optimized multi-channel compression

The sGraph framework in symbolic superoptimization departs from traditional computational graphs by:

Encoding not only operation sequencing but also symbolic parallelization strategies and dimension mappings, rather than fixed schedules or layouts.
Supporting compact representation of exponentially many candidate implementations, enabling early pruning and equivalence verification at the symbolic level rather than per-concrete instance.
Employing e-graph rewriting for program equivalence, subsuming rule-based and algebraic transformations as a verification substrate.

In the sketch learning and compression contexts, sGraph modules are distinctive in their use of high-level, interpretable node definitions (strokes or spectral windows), explicit yet flexible sparse connectivity, and deep integration with coarser/finer graph representations or other convolutional/attention-based modules.

6. Empirical Impact and Availability

Evaluation across specialized tasks demonstrates that sGraph-based architectures deliver both efficiency and representational power:

In tensor program optimization, Prism's sGraph representation yields faster kernels and a reduced search cost compared to the best code superoptimizers (Wu et al., 16 Apr 2026).
In sketch analysis, the sGraph module contributes to 1.15% classification and 1.7% retrieval gains over the state-of-the-art, and achieves 36.58% improvement in vector sketch generation metrics (Cheng et al., 14 Oct 2025).
In multispectral solar image compression, sGraph yields state-of-the-art MSID, PSNR, and perceptual (MS-SSIM) performance with only moderate training and inference overhead (Siwakoti et al., 30 Dec 2025).

The sGraph codebase for solar image compression is publicly available at https://github.com/agyat4/sgraph, supporting reproducibility and further research (Siwakoti et al., 30 Dec 2025).

7. Outlook and Extensions

The sGraph formalism, in its various incarnations, establishes a rigorous bridge between symbolic reasoning, efficient sparse structure exploitation, and large-scale deep learning or optimization domains. Possible extensions include:

Extension of symbolic sGraph search and pruning to more general classes of operator fusion, scheduling, and device placement problems.
Integration of sGraph-based modules for multimodal representation learning beyond sketches, such as audio or structured document domains.
Application in real-time or streaming settings, leveraging the efficiency of symbolic or stroke-level sparse graphs.

A plausible implication is that continued generalization and application of sGraph principles across computational and learning domains will lead to new algorithms that balance tractability, expressivity, and hardware-aware performance guarantees.