ONNX-based Gaussian Generator
- ONNX-based Gaussian Generator is a standardized module that bridges neural 3D Gaussian Splatting models with WebGPU rendering pipelines via a strict I/O contract.
- It employs a per-frame, asynchronous TypeScript API and onnxruntime-web to efficiently convert ONNX model outputs into GPU-optimized tensors for real-time compositing.
- This design decouples generator logic from rendering, enabling advanced neural processing with up to 135× speed improvements while maintaining high visual quality.
An ONNX-based Gaussian Generator is a standardized software module that binds ONNX-exported 3D Gaussian Splatting (3DGS) models to high-performance real-time rendering pipelines, specifically for deployment in WebGPU-enabled environments. This approach, exemplified by the Visionary platform, formalizes a "contract" enforcing consistent I/O conventions between neural 3DGS generators—ranging from classic to MLP-based, 4D-extended, and avatar-conditioned networks—and a browser-native renderer. It enables per-frame, plug-and-play neural processing, facilitating both reconstructive and generative rendering paradigms and allowing advanced feedforward post-processing directly on the client side (Gong et al., 9 Dec 2025).
1. Standardized Gaussian Generator Contract
The core feature of the ONNX-based Gaussian Generator is a strict contract-defined interface that decouples generator implementation from rendering. The interface mandates:
- Per-frame input tensors encoding camera parameters (such as a 4×4 view-projection matrix), frame index, and optional control signals.
- Output tensors containing packed per-Gaussian attributes: positions ($\mu_i \in \mathbb{R}^3$), anisotropic covariances (the upper-triangular part of $\Sigma_i \in \mathbb{R}^{3\times 3}$), colors ($c_i$), and opacities ($\alpha_i \in [0,1]$).
This contract is concretely instantiated through a concise TypeScript API using onnxruntime-web. The generator is initialized from an ONNX model URL and exposes an asynchronous .run() method, returning the requisite outputs as GPU-uploadable tensors. These contract specifications generalize across task variants (static 3DGS, dynamic 4DGS, avatars, etc.), enabling seamless switching or chaining of ONNX generators in a single WebGPU-powered rendering pipeline (Gong et al., 9 Dec 2025).
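As a minimal illustration of this contract, the generator interface can be typed as below. The interface and field names here are hypothetical, not the exact Visionary API; a real implementation would wrap an onnxruntime-web `InferenceSession` created with the WebGPU execution provider rather than the mock shown.

```typescript
// Hypothetical sketch of the generator contract; names are illustrative,
// not the exact Visionary API.
interface FrameInputs {
  viewProj: Float32Array;   // flattened 4x4 view-projection matrix
  frameIndex: number;
  controls?: Float32Array;  // optional conditioning signals (e.g. avatar pose)
}

interface GaussianOutputs {
  positions: Float32Array;   // N x 3
  covariances: Float32Array; // N x 6 (upper-triangular of each 3x3 covariance)
  colors: Float32Array;      // N x 3
  opacities: Float32Array;   // N x 1
  count: number;
}

interface GaussianGenerator {
  run(inputs: FrameInputs): Promise<GaussianOutputs>;
}

// A trivial mock conforming to the contract, standing in for an ONNX session
// (e.g. ort.InferenceSession.create(url, { executionProviders: ["webgpu"] })).
class MockGenerator implements GaussianGenerator {
  constructor(private n: number) {}
  async run(_inputs: FrameInputs): Promise<GaussianOutputs> {
    return {
      positions: new Float32Array(this.n * 3),
      covariances: new Float32Array(this.n * 6),
      colors: new Float32Array(this.n * 3),
      opacities: new Float32Array(this.n),
      count: this.n,
    };
  }
}
```

Per frame, the renderer awaits `.run()` and uploads the returned arrays to GPU buffers; swapping generators (static 3DGS, 4DGS, avatar) then only changes the model behind the interface, not the render loop.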
2. Mathematical Formalism of 3D Gaussian Splatting
The mathematical basis follows the canonical 3DGS definition. Each scene is represented by a set of Gaussian primitives:

$$G_i(x) = \exp\!\left(-\tfrac{1}{2}(x - \mu_i)^\top \Sigma_i^{-1} (x - \mu_i)\right)$$

Spatial scale and orientation are encoded via a diagonal scaling matrix $S_i$ and a rotation $R_i$ derived from a unit quaternion $q_i$:

$$\Sigma_i = R_i S_i S_i^\top R_i^\top$$

At render time:
- Gaussian centers are projected as $\mu_i' = W\mu_i$ using the camera (view) matrix $W$. The projected covariance becomes $\Sigma_i' = J W \Sigma_i W^\top J^\top$, where $J$ is the Jacobian of the projective transform.
- Each pixel $p$ receives a splat contribution $\alpha_i\, G_i'(p)$, where $G_i'$ is the projected 2D Gaussian with covariance $\Sigma_i'$.
- Per-pixel color is composited in strict back-to-front order, equivalently written over depth-sorted splats as:

$$C(p) = \sum_{i=1}^{N} c_i\, \alpha_i \prod_{j=1}^{i-1} (1 - \alpha_j)$$
This pipeline supports both isotropic and anisotropic splats as required by the generator and variant type (Gong et al., 9 Dec 2025).
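The projected-covariance step above can be sketched numerically. This is a plain TypeScript sketch for intuition, not the shader implementation; it takes a 2×3 Jacobian row-block $J$ and a 3×3 view rotation $W$, matrix shapes being the only assumption.

```typescript
type Mat = number[][];

// Generic dense matrix product: (rows of a) x (cols of b).
function matMul(a: Mat, b: Mat): Mat {
  return a.map(row =>
    b[0].map((_, j) => row.reduce((sum, v, k) => sum + v * b[k][j], 0))
  );
}

function transpose(m: Mat): Mat {
  return m[0].map((_, j) => m.map(row => row[j]));
}

// Sigma' = J W Sigma W^T J^T : project a 3x3 world-space covariance
// to the 2x2 screen-space covariance used for the splat footprint.
function projectCovariance(sigma: Mat, J: Mat, W: Mat): Mat {
  const T = matMul(J, W);                        // 2x3
  return matMul(matMul(T, sigma), transpose(T)); // 2x2
}
```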
3. Per-Frame ONNX to WebGPU Execution Pipeline
The per-frame data flow is orchestrated entirely in-browser, without CPU-side post-processing or server dependencies:
- Generator Stage: The Gaussian generator (ONNX) is executed on WebGPU via onnxruntime-web. Models may implement classic anchor-based, MLP-based, 4DGS deformation, or avatar LBS kinematics, returning standard outputs.
- Pre-packing & Upload: Outputs are cast to FP16 and densely packed (two values per u32 word) for efficient GPU buffer upload.
- WebGPU Preprocessing: A compute shader transforms and culls Gaussians, computes 2D projected covariances, and materializes per-Gaussian screen-aligned quads and depth keys.
- GPU Radix Sort: A parallel radix sort is performed entirely on the GPU (O(n) in the number of splats), yielding a globally ordered index list for back-to-front blending.
- Instanced Splat Rasterization: All sorted splats are rendered in a single pass, with optional mesh occlusion via a prior depth-prepass.
The data never leaves the GPU throughout these stages, ensuring minimal bandwidth, maximal memory locality, and frame-to-frame consistency (Gong et al., 9 Dec 2025).
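The pre-packing step (cast to FP16, two values per u32 word) can be sketched in plain TypeScript. This is an illustrative round-toward-zero conversion, not Visionary's actual packing code; the low-half-first word layout is an assumption.

```typescript
// Scratch buffers for reading a float32 bit pattern.
const f32buf = new Float32Array(1);
const u32view = new Uint32Array(f32buf.buffer);

// Convert a float32 value to its IEEE 754 half-precision bit pattern.
// Truncating (round-toward-zero) conversion, for illustration.
function f32ToF16Bits(value: number): number {
  f32buf[0] = value;
  const x = u32view[0];
  const sign = (x >>> 16) & 0x8000;
  const exp = (x >>> 23) & 0xff;
  let mant = x & 0x7fffff;
  if (exp === 0xff) return sign | 0x7c00 | (mant ? 0x200 : 0); // Inf / NaN
  const e = exp - 127 + 15;                // re-bias exponent
  if (e >= 0x1f) return sign | 0x7c00;     // overflow -> Inf
  if (e <= 0) {
    if (e < -10) return sign;              // underflow -> signed zero
    mant |= 0x800000;                      // restore implicit leading bit
    return sign | (mant >>> (14 - e));     // half-precision subnormal
  }
  return sign | (e << 10) | (mant >>> 13);
}

// Pack pairs of FP16 values into u32 words (low half first), matching a
// two-halves-per-word GPU buffer layout.
function packF16Pairs(values: Float32Array): Uint32Array {
  const out = new Uint32Array(Math.ceil(values.length / 2));
  for (let i = 0; i < values.length; i += 2) {
    const lo = f32ToF16Bits(values[i]);
    const hi = i + 1 < values.length ? f32ToF16Bits(values[i + 1]) : 0;
    out[i >> 1] = (lo | (hi << 16)) >>> 0;
  }
  return out;
}
```

On the GPU side, a WGSL shader can recover each pair with `unpack2x16float`, so no per-element conversion is needed in the shader's hot path.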
4. GPU-Based Primitive Sorting and Compositing
Back-to-front compositing is essential for correct alpha blending in 3DGS. The pipeline:
- Writes N-length buffers of depths and indices in the compute stage.
- Applies a fast WebGPU radix sort (McIlroy et al. 1993) over depths.
- Binds the sorted index buffer directly to the rasterizer, enabling O(1) access in the vertex shader and guaranteeing global compositing order.
This approach eliminates the need for CPU-bound sorts (as in SparkJS) or approximate local sorts (SuperSplat). Empirical results report 100×–150× lower frame times for sorting and end-to-end rendering in large-scale scenes with millions of Gaussians, with no local sorting artifacts or lag under rapid viewpoint changes (Gong et al., 9 Dec 2025).
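For intuition, the GPU radix sort can be mirrored on the CPU. The sketch below performs the same stable 8-bit-digit passes over u32 depth keys that the WebGPU version runs in parallel; it is illustrative, not Visionary's shader code.

```typescript
// Stable LSD radix sort over u32 depth keys, returning a permutation of
// indices. Four 8-bit passes; each pass is a counting sort, which is the
// histogram + prefix-sum + scatter structure a GPU implementation
// parallelizes across workgroups.
function radixSortIndices(keys: Uint32Array): Uint32Array {
  const n = keys.length;
  let src = Uint32Array.from({ length: n }, (_, i) => i);
  let dst = new Uint32Array(n);
  for (let shift = 0; shift < 32; shift += 8) {
    const offsets = new Uint32Array(257);
    for (let i = 0; i < n; i++) {
      offsets[(((keys[src[i]] >>> shift) & 0xff) + 1)]++;   // histogram
    }
    for (let d = 0; d < 256; d++) offsets[d + 1] += offsets[d]; // prefix sum
    for (let i = 0; i < n; i++) {
      const digit = (keys[src[i]] >>> shift) & 0xff;
      dst[offsets[digit]++] = src[i];                       // stable scatter
    }
    [src, dst] = [dst, src];
  }
  return src; // ascending by key; traverse in reverse for back-to-front
}
```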
Benchmark Comparison Table
| Gaussians | SparkJS (WebGL + CPU sort) | Visionary (WebGPU all-in-shader) |
|---|---|---|
| 6.06 M | 176.9 ms | 2.09 ms |
| 3.03 M | 145.8 ms | 1.09 ms |
| 1.52 M | 46.3 ms | 0.60 ms |
| 0.76 M | 33.8 ms | 0.40 ms |
Visionary achieves up to approximately 135× end-to-end speedup, with output quality (PSNR/SSIM/LPIPS) comparable to or exceeding SparkJS (Gong et al., 9 Dec 2025).
5. Extensibility: Custom ONNX Generators and Post-Processors
The contract enables fully client-side, browser-executed generative and enhancement networks, including style transfer, diffusion denoising, and neural avatars. The required ONNX I/O must conform to the contract: custom networks (e.g., style nets) are trained in PyTorch and exported to ONNX with precise input/output naming and axes, then loaded and invoked per frame via the TypeScript API. Further, optional post-processing modules can act on rendered outputs (e.g., color images) and primitive buffers simultaneously. This allows complex generative pipelines (such as real-time style transfer or appearance editing) to be executed entirely in-browser without Python backends or server compute (Gong et al., 9 Dec 2025).
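Per-frame chaining of a generator with an optional post-processor can be sketched as below. The stage functions stand in for onnxruntime-web `session.run()` calls; all names and the tensor-in/tensor-out shape are illustrative assumptions, not the exact Visionary API.

```typescript
// Hypothetical per-frame chain: generator -> rasterizer -> post-processor.
// Each stage is an async tensor-in/tensor-out function, the same shape an
// onnxruntime-web session.run() invocation presents to the caller.
type Stage = (input: Float32Array) => Promise<Float32Array>;

async function renderFrame(
  generate: Stage,     // ONNX Gaussian generator
  rasterize: Stage,    // WebGPU splatting pass (stubbed here)
  postProcess?: Stage  // optional ONNX enhancement net (e.g. style transfer)
): Promise<Float32Array> {
  const gaussians = await generate(new Float32Array(16)); // camera inputs
  const image = await rasterize(gaussians);
  return postProcess ? postProcess(image) : image;
}
```

Because every stage shares the same calling convention, a stylization or denoising network slots into the chain without touching generator or rasterizer code.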
6. Integration, Applications, and Significance
By decoupling neural generator logic (authored in PyTorch or similar) from the rendering backend, ONNX-based Gaussian Generators make it practical to experiment with and deploy new 3DGS-family algorithms, including reconstructive, generative, and dynamic content paradigms. Visionary supplies a concise TypeScript API for per-frame generator invocation and primitive upload, optimized GPU memory packing, and unified render loop management for traditional mesh and 3DGS-based scenes. This structure significantly lowers the technical barrier for reproduction, extension, and comparison of state-of-the-art neural rendering approaches, supporting real-time operation and extensibility in the browser environment (Gong et al., 9 Dec 2025).
The zero-install pipeline—ONNX generator to WebGPU rasterizer—addresses the main historical bottlenecks: fragmented pipelines, CPU-GPU dataflow inefficiencies, and the lack of generative capability and throughput in interactive viewers. Its architecture enables both academic benchmarking and deployment in production systems, serving as a unified substrate for world model rendering and generative visual effects.