SuperQuadric Voxelization Module

Updated 28 November 2025

SuperQuadric Voxelization Module is a computational tool that translates parametric superquadric primitives into discrete 3D voxel grids, enabling efficient semantic and granular analysis.
It employs a GPU-optimized pipeline with bounding-box pruning and occupancy accumulation to achieve real-time inference speeds of up to 21.5 FPS on high-performance hardware.
The approach reduces memory footprint by up to 75% compared to Gaussian-based methods while improving accuracy by +5.9% mIoU in semantic occupancy tasks.

The SuperQuadric Voxelization Module is a computational architecture designed to convert continuous superquadric-based shape and occupancy representations into discrete 3D voxel grids for efficient evaluation, analysis, and benchmarking. By translating parametric superquadric primitives—capable of modeling a wide variety of geometric forms—into a voxel-based format, it underpins a range of applications from semantic scene understanding in automated driving to granular packing analysis in discrete element simulation. The module exploits the compactness and shape expressiveness of superquadrics, outperforming traditional Gaussian or multisphere-based approaches in both memory efficiency and real-time inference speed across diverse settings (Hayes et al., 21 Nov 2025, Zuo et al., 12 Jun 2025, Pola et al., 2021).

1. Superquadric Parameterization and Mathematical Formulation

Superquadrics are parameterized implicit surfaces. The surface is the level set $F(x, y, z) = 1$ of the inside–outside function

$F(x, y, z) = \left( \left| \frac{x}{s_x} \right|^{2/\varepsilon_2} + \left| \frac{y}{s_y} \right|^{2/\varepsilon_2} \right)^{\varepsilon_2/\varepsilon_1} + \left|\frac{z}{s_z}\right|^{2/\varepsilon_1}$

where $(s_x, s_y, s_z)$ specify the semi-axes (scales) and $(\varepsilon_1, \varepsilon_2)$ are shape exponents controlling "squareness" or "roundness" (Hayes et al., 21 Nov 2025, Zuo et al., 12 Jun 2025).

For a given primitive $S_i$, the global-to-local transformation is defined as:

$p_s = R_i\, (p - \mu_i)$

with $R_i$ as the rotation matrix and $\mu_i$ as the center. The inside–outside function evaluated at point $p$ is:

$F_i(p) = \left( \left| \frac{p_{s,x}}{s_x} \right|^{2/\varepsilon_2} + \left| \frac{p_{s,y}}{s_y} \right|^{2/\varepsilon_2} \right)^{\varepsilon_2 / \varepsilon_1} + \left| \frac{p_{s,z}}{s_z} \right|^{2/\varepsilon_1}$

The unnormalized occupancy density is then defined as:

$p_o^{(i)}(p) = \exp(-F_i(p))$

and each superquadric is further associated with a learned opacity $\sigma_i$ and semantic-logit vector $c_i$ (Hayes et al., 21 Nov 2025, Zuo et al., 12 Jun 2025, Pola et al., 2021).
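As a minimal sketch of the two formulas above, the following Python evaluates $F_i(p)$ and the unnormalized density for a single primitive. The function names and the nested-list rotation format are illustrative assumptions, not taken from the cited works.

```python
import math

def inside_outside(p, mu, R, scales, eps1, eps2):
    """Evaluate F_i(p) for one superquadric.

    p, mu  : 3-tuples (world point, primitive center)
    R      : 3x3 rotation as nested lists, applied as p_s = R (p - mu)
    scales : (s_x, s_y, s_z); eps1, eps2 are the shape exponents
    """
    d = [p[k] - mu[k] for k in range(3)]
    # Global-to-local transform p_s = R (p - mu)
    ps = [sum(R[r][k] * d[k] for k in range(3)) for r in range(3)]
    x, y, z = (abs(ps[k]) / scales[k] for k in range(3))
    xy = (x ** (2.0 / eps2) + y ** (2.0 / eps2)) ** (eps2 / eps1)
    return xy + z ** (2.0 / eps1)

def occupancy_density(p, mu, R, scales, eps1, eps2):
    """Unnormalized occupancy density exp(-F_i(p))."""
    return math.exp(-inside_outside(p, mu, R, scales, eps1, eps2))

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

With this definition the density equals 1 at the primitive center (where $F_i = 0$) and $e^{-1}$ on the surface (where $F_i = 1$).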

2. Voxelization Algorithm and Pipeline

The SuperQuadric Voxelization Module translates superquadric representations into a discrete grid by direct evaluation at voxel centers. The pipeline consists of the following principal steps (Hayes et al., 21 Nov 2025, Zuo et al., 12 Jun 2025, Pola et al., 2021):

Voxel Grid Setup: Define a regular 3D grid over the region of interest, specifying voxel size $r$ and computing integer grid dimensions $(N_x, N_y, N_z)$.
Bounding-Box Pruning: For each superquadric, compute an axis-aligned bounding box (AABB) by inflating its scale parameters, reducing the number of voxels to be tested.
Occupancy and Semantics Accumulation: For each voxel within a superquadric's AABB:
- Transform the voxel center to local superquadric coordinates.
- Evaluate $F_i(p)$ and compute the occupancy weight $w = \exp(-F_i(p))$.
- Skip voxels with $w < \varepsilon_{\text{thresh}}$ (early pruning).
- Accumulate $w \cdot \sigma_i$ in the opacity buffer $V_o[v]$, and $w \cdot c_i$ in the semantic buffer $V_c[v]$.
Final Label Assignment: After all primitives have contributed:
- If $V_o[v] < \tau_{\text{free}}$, label the voxel as free.
- If $V_o[v] \geq \tau_{\text{free}}$, assign the class via $\arg\max(V_c[v]/V_o[v])$.

This procedure is parallelizable, typically implemented as fused CUDA kernels with all primitive and voxel buffers resident on the GPU (Hayes et al., 21 Nov 2025).
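The four steps can be sketched on the CPU as follows. This is an illustrative, unoptimized Python version, not the fused CUDA implementation. The dictionary layout, the parameter names (`inflate`, `w_thresh`, `tau_free`), their default values, and the choice of a cubic AABB sized by the largest scale (so that it stays conservative under rotation) are all assumptions made for the example.

```python
import math

def inside_outside(p, q):
    """F_i(p) for primitive q (dict with mu, R, scales, eps1, eps2)."""
    d = [p[k] - q["mu"][k] for k in range(3)]
    ps = [sum(q["R"][r][k] * d[k] for k in range(3)) for r in range(3)]
    x, y, z = (abs(ps[k]) / q["scales"][k] for k in range(3))
    e1, e2 = q["eps1"], q["eps2"]
    return (x ** (2 / e2) + y ** (2 / e2)) ** (e2 / e1) + z ** (2 / e1)

def voxelize(prims, origin, r, dims, inflate=3.0, w_thresh=1e-3, tau_free=0.1):
    """Return {voxel index: class id}; voxels missing from the dict are free."""
    V_o, V_c = {}, {}  # sparse opacity and semantic buffers
    for q in prims:
        # Bounding-box pruning: cube of half-width inflate * max scale
        half = inflate * max(q["scales"])
        lo = [max(0, math.floor((q["mu"][k] - half - origin[k]) / r))
              for k in range(3)]
        hi = [min(dims[k] - 1, math.floor((q["mu"][k] + half - origin[k]) / r))
              for k in range(3)]
        for i in range(lo[0], hi[0] + 1):
            for j in range(lo[1], hi[1] + 1):
                for k in range(lo[2], hi[2] + 1):
                    center = [origin[a] + (idx + 0.5) * r
                              for a, idx in enumerate((i, j, k))]
                    w = math.exp(-inside_outside(center, q))
                    if w < w_thresh:  # early pruning
                        continue
                    v = (i, j, k)
                    V_o[v] = V_o.get(v, 0.0) + w * q["sigma"]
                    acc = V_c.setdefault(v, [0.0] * len(q["logits"]))
                    for c, logit in enumerate(q["logits"]):
                        acc[c] += w * logit
    labels = {}
    for v, o in V_o.items():
        if o >= tau_free:  # occupied: argmax of normalized semantics
            sem = [c / o for c in V_c[v]]
            labels[v] = max(range(len(sem)), key=sem.__getitem__)
    return labels
```

A suitable `inflate` value depends on the shape exponents and on `w_thresh`, since $\exp(-F_i)$ decays more slowly for larger $\varepsilon_1$; the default here is only a placeholder.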

3. Training-Time Gaussian Approximation and Adaptive Strategies

During training, superquadric primitives cannot be supervised directly because no differentiable superquadric rasterizer is available. So that existing Gaussian-based losses can still be used, each superquadric is approximated by a shell of small 3D Gaussians (Hayes et al., 21 Nov 2025):

Radial scales $K = \{k_1, \ldots, k_L\}$ are chosen to generate multi-layer shells.
A tessellated icosphere at each layer provides sampling directions; the corresponding surface points of the scaled superquadric are used as Gaussian means.
Each Gaussian inherits semantic logits and receives local covariance matching the mesh geometry.
Gaussian opacities are scaled to match the superquadric density at their center.

At inference, these proxies are discarded and only the closed-form occupancy function is employed for maximum efficiency (Hayes et al., 21 Nov 2025).
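The shell construction can be illustrated with a short Python sketch. It relies on the fact that $F$ is homogeneous of degree $2/\varepsilon_1$, so the surface point along a unit direction $d$ is $d \cdot F(d)^{-\varepsilon_1/2}$, and the layer at radial scale $k$ is that point multiplied by $k$. Several simplifications are assumptions of this example: the 12 vertices of a plain icosahedron stand in for a tessellated icosphere, points are kept in the primitive's local frame, covariances are omitted, and the opacity rule (learned opacity times density at the mean) is one possible reading of the description above.

```python
import math

def inside_outside_local(p, scales, eps1, eps2):
    """F evaluated on a point already in the primitive's local frame."""
    x, y, z = (abs(p[k]) / scales[k] for k in range(3))
    return (x ** (2.0 / eps2) + y ** (2.0 / eps2)) ** (eps2 / eps1) \
        + z ** (2.0 / eps1)

def icosahedron_directions():
    """12 unit vectors: the vertices of a regular icosahedron."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    raw = []
    for a in (-1.0, 1.0):
        for b in (-phi, phi):
            raw += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]
    norm = math.sqrt(1.0 + phi * phi)
    return [tuple(c / norm for c in v) for v in raw]

def shell_gaussians(scales, eps1, eps2, sigma, radial_scales):
    """Gaussian means and opacities on multi-layer superquadric shells."""
    out = []
    for k in radial_scales:
        for d in icosahedron_directions():
            # Scale d so that it lands on the surface F = 1, then apply k
            t = inside_outside_local(d, scales, eps1, eps2) ** (-eps1 / 2.0)
            mean = tuple(k * t * c for c in d)
            density = math.exp(-inside_outside_local(mean, scales, eps1, eps2))
            out.append({"mean": mean, "opacity": sigma * density})
    return out
```

For example, `shell_gaussians((2.0, 1.0, 0.5), 0.8, 1.2, 1.0, [0.5, 1.0])` produces 24 proxies, 12 per layer.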

QuadricFormer further introduces a pruning-and-splitting mechanism applied after the quadric-encoder blocks: primitives with negligible scales are pruned to avoid redundancy, while overly coarse primitives with excessive scales are split along their principal axes and refined. This keeps the primitive budget fixed while improving spatial adaptability (Zuo et al., 12 Jun 2025).
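One way to picture a budget-preserving prune-and-split step is sketched below in Python. It is not the QuadricFormer rule: the ranking by largest scale, the choice to split only along the longest axis, the halving of that scale, and the axis-aligned offset of the children are all hypothetical choices made to keep the example short.

```python
def prune_and_split(prims, k):
    """Drop the k smallest primitives and split the k largest in two.

    prims : list of dicts with 'mu' and 'scales' (3-element lists),
            assumed axis-aligned for this illustration.
    The output has the same length as the input (fixed budget).
    """
    assert 2 * k <= len(prims), "need disjoint prune and split sets"
    order = sorted(range(len(prims)), key=lambda i: max(prims[i]["scales"]))
    drop = set(order[:k])
    split = set(order[len(order) - k:])
    out = []
    for i, q in enumerate(prims):
        if i in drop:
            continue  # negligible scale: prune
        if i not in split:
            out.append(q)
            continue
        # Overly coarse: split along the longest axis into two children
        axis = max(range(3), key=lambda a: q["scales"][a])
        half = q["scales"][axis] / 2.0
        for sign in (-1.0, 1.0):
            child = dict(q)
            child["mu"] = list(q["mu"])
            child["mu"][axis] += sign * half
            child["scales"] = list(q["scales"])
            child["scales"][axis] = half
            out.append(child)
    return out
```

Each pruned primitive frees exactly one slot and each split consumes one, so the count is unchanged after the step.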

4. Data Structures and Computational Considerations

Efficient data representations and parallelization strategies are critical for high-performance voxelization:

Structure-of-Arrays (SoA): Superquadric parameters (means, rotations, scales, opacities, shape exponents, semantic logits) are stored in SoA format to ensure coalesced global memory access on GPU hardware (Hayes et al., 21 Nov 2025).
Sparse Voxel–Primitive Indexing: Primitives are binned in a spatial grid to limit each voxel’s neighbor search to a small local list (typically $|N(v)|\approx5$), reducing computational complexity from $O(NV)$ to $O(V|N|)$ (Hayes et al., 21 Nov 2025).
Memory Buffers: Occupancy and semantic buffers use low-precision (float16) intermediate accumulation to reduce bandwidth, with final operations in float32 for stability.
Voxel Grid: Implemented as uint8 or bool arrays for packing analysis, with typical grid sizes dictated by application (e.g., $200\times200\times16$ for scene occupancy, up to $250\times250\times450$ for granular assemblies) (Hayes et al., 21 Nov 2025, Pola et al., 2021).
Bounding-Box Only Subgrid Processing: AABB computation confines evaluation to potentially nonempty voxels, offering up to 2–3 $\times$ speedup (Hayes et al., 21 Nov 2025, Zuo et al., 12 Jun 2025).
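To make the SoA layout and the spatial binning concrete, here is a small Python sketch. The field names (`mu_x`, `radius`, and so on), the use of a single bounding radius per primitive, and the uniform cell size are assumptions for illustration; a GPU implementation would use flat device arrays rather than Python lists and dictionaries.

```python
import math
from collections import defaultdict

def build_bins(soa, cell):
    """Bin primitives into a uniform grid of coarse cells.

    soa  : dict of parallel lists (Structure-of-Arrays), one list per
           parameter: 'mu_x', 'mu_y', 'mu_z', 'radius'
    cell : edge length of one coarse cell
    Returns {cell index: [primitive ids whose bounding cube overlaps it]}.
    """
    bins = defaultdict(list)
    for i in range(len(soa["mu_x"])):
        c = (soa["mu_x"][i], soa["mu_y"][i], soa["mu_z"][i])
        rad = soa["radius"][i]
        lo = [math.floor((c[k] - rad) / cell) for k in range(3)]
        hi = [math.floor((c[k] + rad) / cell) for k in range(3)]
        for a in range(lo[0], hi[0] + 1):
            for b in range(lo[1], hi[1] + 1):
                for d in range(lo[2], hi[2] + 1):
                    bins[(a, b, d)].append(i)
    return bins

def neighbors(bins, p, cell):
    """Candidate primitive ids N(v) for a voxel center p."""
    key = tuple(math.floor(p[k] / cell) for k in range(3))
    return bins.get(key, [])
```

Each voxel then evaluates only the primitives in its own cell's list, which is where the $O(V|N|)$ cost comes from.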

5. Accuracy, Performance Metrics, and Scaling

The SuperQuadric Voxelization Module is associated with quantifiable accuracy, speed, and resource metrics:

Approach	Primitives	Memory Footprint	Inference Speed	mIoU (improvement)
GaussianFlowOcc	10,000 Gaussians	2.85 GB	—	—
SuperQuadricOcc	1,600 SQs	0.70 GB	21.5 FPS	+5.9%
Gaussian-only Voxelizer	1,600 Gaussians	0.62 GB	20.4 FPS	—

Reducing the primitive count by 84% (from 10,000 Gaussians to 1,600 superquadrics) yields a $\sim75\%$ decrease in memory consumption and a $124\%$ speedup relative to dense Gaussian rasterization, together with a $+5.9\%$ improvement in mean IoU on Occ3D datasets (Hayes et al., 21 Nov 2025).
Voxelization adds only $\sim5$ ms per frame (of a total of $46$ ms, i.e., $21.5$ FPS on an A100 GPU).
Accuracy of volume recovery can reach errors $<2\%$ at voxel sizes $r$ corresponding to $10\%$ of the smallest superquadric axis; CPU voxelization of individual particles ranges from $10$ to $50$ ms in granular packing analysis (Pola et al., 2021).
Pruning and AABB optimization are central: disabling them increases compute cost 2–3 $\times$ with no tangible accuracy gain (Hayes et al., 21 Nov 2025).
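The headline percentages follow directly from the table entries, as this short Python check shows. Only figures stated above are used; the variable names are illustrative.

```python
gaussians, superquadrics = 10_000, 1_600
mem_gaussian_gb, mem_sq_gb = 2.85, 0.70
fps = 21.5
voxelization_ms = 5.0

primitive_reduction = 1 - superquadrics / gaussians   # 0.84
memory_reduction = 1 - mem_sq_gb / mem_gaussian_gb    # about 0.754
frame_ms = 1000.0 / fps                               # about 46.5 ms
voxelization_share = voxelization_ms / frame_ms       # about 0.11

print(f"primitive reduction: {primitive_reduction:.1%}")
print(f"memory reduction:    {memory_reduction:.1%}")
print(f"frame time:          {frame_ms:.1f} ms")
print(f"voxelization share:  {voxelization_share:.1%}")
```

Voxelization therefore accounts for roughly a tenth of the per-frame budget.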

6. Practical Applications and Extensions

The SuperQuadric Voxelization Module is integral to several domains:

Self-Supervised Semantic Occupancy Estimation: Underlies real-time scene understanding, enabling per-voxel IoU, mIoU, and RayIoU computation for autonomous driving benchmarks (Hayes et al., 21 Nov 2025, Zuo et al., 12 Jun 2025).
Granular Packing Analysis: Facilitates post-processing of discrete element simulations, allowing extraction of packing fraction, wall effect, stacking, and local density in arbitrary subdomains (Pola et al., 2021).
Scene Representation Compression: By leveraging the geometric expressiveness of superquadrics, representation budgets are reduced without loss of structural detail, yielding higher efficiency in memory and compute-limited deployments (Hayes et al., 21 Nov 2025, Zuo et al., 12 Jun 2025).

Additional extensions provided by the referenced works include voxel-based packing fraction analysis, 2D/3D heat-maps, and support for arbitrary cross-sectional statistics (Pola et al., 2021).
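Volume recovery and packing fraction both reduce to counting occupied voxels. The Python sketch below voxelizes a single axis-aligned superquadric centered at the origin and reports the voxel-estimated volume and the packing fraction of its bounding box. The function name, the center-sampling rule ($F \leq 1$ at the voxel center), and the small epsilon guarding against floating-point round-up in the grid size are assumptions of this example.

```python
import math

def voxel_volume(scales, eps1, eps2, r):
    """Voxelize one centered, axis-aligned superquadric.

    Returns (estimated volume, packing fraction of the bounding box),
    counting voxels whose center satisfies F <= 1.
    """
    n = [math.ceil(s / r - 1e-9) for s in scales]  # half-extent in voxels
    count = 0
    for i in range(-n[0], n[0]):
        x = abs((i + 0.5) * r) / scales[0]
        for j in range(-n[1], n[1]):
            y = abs((j + 0.5) * r) / scales[1]
            for k in range(-n[2], n[2]):
                z = abs((k + 0.5) * r) / scales[2]
                F = (x ** (2 / eps2) + y ** (2 / eps2)) ** (eps2 / eps1) \
                    + z ** (2 / eps1)
                if F <= 1.0:
                    count += 1
    total = 8 * n[0] * n[1] * n[2]
    return count * r ** 3, count / total

# Ellipsoid (eps1 = eps2 = 1) with r at 10% of the smallest semi-axis
scales = (1.0, 1.5, 2.0)
volume, fraction = voxel_volume(scales, 1.0, 1.0, r=0.1)
analytic = 4.0 / 3.0 * math.pi * scales[0] * scales[1] * scales[2]
print(f"relative volume error: {abs(volume - analytic) / analytic:.3%}")
print(f"packing fraction:      {fraction:.4f} (analytic pi/6 for an ellipsoid)")
```

The ellipsoid case is used because its volume has a simple closed form to compare against; for a multi-particle packing, the same count would be taken over the union of occupied voxels in the chosen subdomain.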

7. Summary of Advantages, Limitations, and Impact

Utilizing superquadrics for voxelization offers several measurable advantages:

Efficiency: Single superquadric primitives model complex geometry typically requiring numerous Gaussians or spheres.
Speed: Direct evaluation of closed-form occupancy at inference and use of GPU-optimized operations yield real-time throughput in dense scene prediction (Hayes et al., 21 Nov 2025).
Compactness: Reduces oversegmentation and redundant computation through adaptive primitive allocation and effective pruning/splitting (Zuo et al., 12 Jun 2025).
Robustness: Voxel thresholding exhibits stability for a wide range of opacity cutoffs (IoU sensitivity $\pm0.3\%$) (Hayes et al., 21 Nov 2025).

A plausible implication is that the technique generalizes effectively across different scales and scene types, as evidenced by applications in automotive and particle simulation contexts.

Key limitations include increased implementation complexity relative to sphere-only or Gaussian-only schemes and the need for careful parameterization (voxel size, thresholds, shell tessellation) to balance accuracy against efficiency. Nonetheless, the approach is established as performant and scalable for both real-time perception and scientific analysis tasks (Hayes et al., 21 Nov 2025, Zuo et al., 12 Jun 2025, Pola et al., 2021).