
Voxel Grid Patchification

Updated 7 February 2026
  • Voxel grid patchification is a method that decomposes complex 3D geometries into local voxel feature grids to facilitate precise neural implicit representations.
  • It employs trilinear interpolation and a shared MLP to compute local signed distance fields, ensuring accurate reconstruction of intricate geometries.
  • Adaptive merging using octree-based structures and constructive solid geometry operations preserves sharp features while reducing computational demand and memory usage.

Voxel grid patchification is a technique for decomposing complex 3D shapes into structured local feature grids (“patch volumes”), each supporting accurate neural implicit representations, and then merging them adaptively using hierarchical data structures. The approach enables efficient, high-fidelity modeling of curved, sharp-featured, or boundary-rich geometries by learning local signed distance fields (SDFs) and applying constructive solid geometry (CSG) operations in a spatially localized and computationally tractable manner. The Patch-Grid method, as introduced in "Patch-Grid: An Efficient and Feature-Preserving Neural Implicit Surface Representation" (Lin et al., 2023), formalizes this approach and addresses key limitations of monolithic MLP-based models in terms of sharp feature preservation, efficiency, and memory footprint.

1. Patch Feature Volume Construction

The pipeline begins with a boundary representation (B-Rep) surface, partitioned into $K$ surface patches $\{S_p\}_{p=1}^K$. For each patch $p$, a mapping to a regular voxel grid (“patch volume”) $V_p$ is constructed. Each $V_p$ is defined as a union of non-empty voxels over the surface patch, subdivided into dimensions $N_p^x \times N_p^y \times N_p^z$ with resolution $\Delta_p$. Every voxel grid corner holds a $D$-dimensional learnable feature vector $f_p[i,j,k] \in \mathbb{R}^D$.

For a query point $x \in V_p$, its position is converted to local patch coordinates and neighboring voxel indices $(i_0,j_0,k_0)$ are computed. The patch feature at $x$, $F_p(x)$, is obtained via trilinear interpolation over the grid corners:

$$F_p(x) = \sum_{d_i=0}^1 \sum_{d_j=0}^1 \sum_{d_k=0}^1 w(d_i,d_j,d_k) \cdot f_p[i_0+d_i, j_0+d_j, k_0+d_k]$$

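with weights $w(d_i,d_j,d_k)$ given by the product of per-axis linear weights: writing $(t_x,t_y,t_z) \in [0,1]^3$ for the fractional position of $x$ within the grid cell, each axis contributes $t$ when its offset index is 1 and $1-t$ when it is 0.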
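
The interpolation above can be sketched in plain Python. This is a minimal illustration, not the Patch-Grid implementation: the nested-list grid layout, the `origin` and `delta` parameters, the index clamping, and the helper `make_linear_grid` are all assumptions made for the example.

```python
import math

def trilinear_feature(grid, origin, delta, x):
    """Interpolate a D-dimensional feature at x from the 8 surrounding corners.

    grid[i][j][k] is a list of D floats (hypothetical layout); origin is the
    grid's minimum corner and delta the voxel edge length.
    """
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    base, frac = [], []
    for a in range(3):
        u = (x[a] - origin[a]) / delta          # continuous voxel coordinate
        # Clamp so the 2x2x2 corner stencil stays inside the grid.
        i0 = min(max(int(math.floor(u)), 0), dims[a] - 2)
        base.append(i0)
        frac.append(min(max(u - i0, 0.0), 1.0))
    out = [0.0] * len(grid[0][0][0])
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                # Product of per-axis linear weights.
                w = ((frac[0] if di else 1.0 - frac[0])
                     * (frac[1] if dj else 1.0 - frac[1])
                     * (frac[2] if dk else 1.0 - frac[2]))
                corner = grid[base[0] + di][base[1] + dj][base[2] + dk]
                for c in range(len(out)):
                    out[c] += w * corner[c]
    return out

def make_linear_grid(n):
    """Toy grid whose single feature channel is i + 2j + 3k at corner (i, j, k)."""
    return [[[[float(i + 2 * j + 3 * k)] for k in range(n)]
             for j in range(n)] for i in range(n)]

grid = make_linear_grid(3)
value = trilinear_feature(grid, (0.0, 0.0, 0.0), 0.5, (0.25, 0.6, 0.9))
```

Because trilinear interpolation reproduces functions that are linear along each axis, the toy grid gives a simple sanity check: the interpolated value should match `i + 2j + 3k` evaluated at the continuous voxel coordinates of the query.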

A small, shared multilayer perceptron $g: \mathbb{R}^D \rightarrow \mathbb{R}$ (typically 3 hidden layers) is applied to yield the per-patch signed distance field $s_p(x) = g(F_p(x))$, producing a neural implicit fit of the local geometry within the patch. The zero-level set $s_p(x) = 0$ approximates the patch surface $S_p$ within $V_p$.
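
A shared decoder of this shape might look like the following sketch. Only the three hidden layers and the scalar output come from the text; the hidden width of 16, the ReLU activation, the zero biases, and the random initialization are assumptions chosen to keep the example small.

```python
import random

def init_mlp(in_dim, hidden=16, n_hidden=3, seed=0):
    """Build weights for a D -> hidden x n_hidden -> 1 MLP (hypothetical sizes)."""
    rng = random.Random(seed)
    sizes = [in_dim] + [hidden] * n_hidden + [1]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = (2.0 / fan_in) ** 0.5
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers

def mlp_sdf(layers, feature):
    """Decode an interpolated feature F_p(x) into a scalar s_p(x) = g(F_p(x))."""
    h = list(feature)
    for idx, (weights, biases) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if idx < len(layers) - 1:
            h = [max(0.0, v) for v in h]   # ReLU on hidden layers only
    return h[0]

layers = init_mlp(in_dim=8)          # one decoder, reused for every patch
s = mlp_sdf(layers, [0.1] * 8)       # feature would come from a patch grid
```

The point of the sketch is the sharing: `layers` is created once, and every patch passes its own interpolated feature through the same weights, so per-patch capacity lives in the feature grids rather than in the network.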

Supervision is provided by a combination of geometric and regularization losses, with sampled points both on the surface and in a local neighborhood. The complete loss functional per patch includes terms for level set fidelity, normal agreement, pseudo-SDF values, Eikonal regularization, off-surface penalties, and feature decay, with specific weighting coefficients (e.g., $\lambda_{sp}=150$, $\lambda_{np}=50$) set as in (Lin et al., 2023).
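
Of the listed terms, the Eikonal regularizer is the easiest to illustrate in isolation: it penalizes deviation of the SDF gradient norm from 1. The sketch below uses central finite differences and a squared penalty on analytic test functions. The exact penalty form and the way gradients are obtained in Patch-Grid are not specified here, so both are assumptions, as are the function names.

```python
import math

def grad_fd(sdf, x, h=1e-4):
    """Central finite-difference gradient of a scalar field at x."""
    g = []
    for a in range(3):
        xp, xm = list(x), list(x)
        xp[a] += h
        xm[a] -= h
        g.append((sdf(xp) - sdf(xm)) / (2.0 * h))
    return g

def eikonal_penalty(sdf, points, h=1e-4):
    """Mean squared deviation of the gradient norm from 1 (assumed form)."""
    total = 0.0
    for x in points:
        norm = math.sqrt(sum(c * c for c in grad_fd(sdf, x, h)))
        total += (norm - 1.0) ** 2
    return total / len(points)

def sphere_sdf(x, r=1.0):
    """Exact signed distance to a sphere of radius r centered at the origin."""
    return math.sqrt(sum(c * c for c in x)) - r

def scaled_field(x):
    """Same zero set as sphere_sdf, but not a distance function."""
    return 2.0 * sphere_sdf(x)

points = [(0.5, 0.2, 0.1), (1.5, -0.3, 0.7)]
exact = eikonal_penalty(sphere_sdf, points)
scaled = eikonal_penalty(scaled_field, points)
```

A true distance field yields a penalty near zero, while the rescaled field shares the same zero-level set but has gradient norm 2 and is penalized, which is why a term of this kind is needed alongside on-surface fidelity.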

2. Localized Merging with Octree-Based Merge Grid

To accurately and efficiently handle intersections, edges, and corners between patches, a hierarchical merge structure is constructed:

  • An adjacency graph $G$ is created with each patch as a node; an edge connects two patches sharing a sharp geometric edge (either convex or concave). Maximally connected subgraphs (cliques) correspond to regions where three or more patches meet at a corner.
  • The 3D domain is recursively subdivided by constructing an octree: starting from the global bounding box $C_0$, a cell is split as long as some patch subgraph within it is not a clique. Once all patch subgraphs in the cell are cliques, subdivision stops and the cell is marked as a leaf. A depth parameter bounds the granularity (typically $\text{max\_depth}\approx 7$).
  • At query or mesh extraction time, points $x$ are efficiently mapped to their enclosing octree leaf cells, each associated with the small subset $P_C$ of patches present in that region.
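
The construction and lookup can be sketched as follows. This is one reading of the stopping rule (every connected component of the patches present in a cell must be a clique), and it approximates "patch present in cell" by axis-aligned bounding-box overlap. Both choices, along with the dict-based node layout and all function names, are assumptions for illustration.

```python
from itertools import combinations

def components(patches, adjacency):
    """Connected components of the adjacency graph restricted to `patches`."""
    remaining, comps = set(patches), []
    while remaining:
        stack = [remaining.pop()]
        comp = set(stack)
        while stack:
            p = stack.pop()
            nbrs = {q for q in remaining if frozenset((p, q)) in adjacency}
            remaining -= nbrs
            comp |= nbrs
            stack.extend(nbrs)
        comps.append(comp)
    return comps

def is_clique(patches, adjacency):
    return all(frozenset(pair) in adjacency for pair in combinations(patches, 2))

def boxes_overlap(a, b):
    (alo, ahi), (blo, bhi) = a, b
    return all(alo[i] <= bhi[i] and blo[i] <= ahi[i] for i in range(3))

def build_octree(cell, patch_boxes, adjacency, depth=0, max_depth=7):
    """Split cells until every component of present patches is a clique."""
    present = {p for p, box in patch_boxes.items() if boxes_overlap(cell, box)}
    done = all(is_clique(c, adjacency) for c in components(present, adjacency))
    if done or depth >= max_depth:
        return {"cell": cell, "patches": present, "children": None}
    lo, hi = cell
    mid = tuple((lo[i] + hi[i]) / 2.0 for i in range(3))
    children = []
    for octant in range(8):   # bit i set means upper half along axis i
        clo = tuple(mid[i] if (octant >> i) & 1 else lo[i] for i in range(3))
        chi = tuple(hi[i] if (octant >> i) & 1 else mid[i] for i in range(3))
        children.append(build_octree((clo, chi), patch_boxes, adjacency,
                                     depth + 1, max_depth))
    return {"cell": cell, "patches": present, "children": children}

def find_leaf(node, x):
    """Descend to the leaf cell containing x."""
    while node["children"] is not None:
        lo, hi = node["cell"]
        octant = sum(1 << i for i in range(3) if x[i] >= (lo[i] + hi[i]) / 2.0)
        node = node["children"][octant]
    return node

# Toy scene: A and B are both adjacent to C, but not to each other.
patch_boxes = {
    "A": ((0.0, 0.0, 0.0), (0.4, 1.0, 1.0)),
    "B": ((0.6, 0.0, 0.0), (1.0, 1.0, 1.0)),
    "C": ((0.0, 0.0, 0.0), (1.0, 1.0, 0.2)),
}
adjacency = {frozenset(("A", "C")), frozenset(("B", "C"))}
tree = build_octree(((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)), patch_boxes, adjacency)
```

In the toy scene the root holds the chain A, C, B, which is connected but not a clique, so it is split once; each child then contains at most one adjacent pair and becomes a leaf.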

This spatial data structure localizes patch interaction, such that patch merging is performed only where interaction occurs, preventing contamination of sharp features and preserving geometric complexity.

3. Constructive Solid Geometry Merging

Within each leaf cell $C$ of the octree, the relevant local SDFs $s_p(x)$ for patches in $P_C$ are merged according to CSG logic:

  • Union (concave join): $s_{union}(x) = \min_{p\in P_C} s_p(x)$.
  • Intersection (convex join): $s_{inter}(x) = \max_{p\in P_C} s_p(x)$.

To enable differentiability, soft-min and soft-max alternatives are used:

$$\text{soft-min}_k(\{s_i\}) = -\frac{1}{k} \log\sum_i \exp(-k s_i)$$

$$\text{soft-max}_k(\{s_i\}) = \frac{1}{k} \log\sum_i \exp(k s_i)$$
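
The hard and soft merge operators can be written directly from these formulas. In the sketch below the sharpness value `k=50.0` is an arbitrary assumption, and the extremum is factored out of the log-sum-exp for numerical stability, which is algebraically identical to the expressions above.

```python
import math

def csg_union(sdfs):
    """Hard union: pointwise minimum of the patch SDF values."""
    return min(sdfs)

def csg_intersection(sdfs):
    """Hard intersection: pointwise maximum of the patch SDF values."""
    return max(sdfs)

def soft_min(sdfs, k=50.0):
    """-(1/k) * log(sum(exp(-k * s))), with the minimum factored out."""
    m = min(sdfs)
    return m - math.log(sum(math.exp(-k * (s - m)) for s in sdfs)) / k

def soft_max(sdfs, k=50.0):
    """(1/k) * log(sum(exp(k * s))), with the maximum factored out."""
    m = max(sdfs)
    return m + math.log(sum(math.exp(k * (s - m)) for s in sdfs)) / k

values = [0.3, -0.1, 0.2]           # s_p(x) for three patches at one query point
hard_lo, soft_lo = csg_union(values), soft_min(values)
hard_hi, soft_hi = csg_intersection(values), soft_max(values)
```

For $n$ inputs the soft operators stay within $\log(n)/k$ of the hard ones, with soft-min never above the true minimum and soft-max never below the true maximum, so a larger $k$ gives a sharper but less smooth merge.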

A per-cell merge loss $L_{merge}(C) = \sum_{x \in X_C} |s_C(x)|$ is evaluated, where $s_C(x)$ is the locally merged SDF and $X_C$ the sampled on-surface points, contributing to the global training objective.
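
A small worked case shows what this loss measures. The sketch uses two analytic half-space SDFs as stand-ins for learned patch SDFs, meeting at a convex edge. The scene and all names are assumptions for illustration.

```python
def merge_loss(merged_sdf, surface_points):
    """Sum of |s_C(x)| over sampled on-surface points X_C."""
    return sum(abs(merged_sdf(x)) for x in surface_points)

# Two half-spaces whose solid is the wedge {x <= 0 and y <= 0}.
def plane_x(x):
    return x[0]

def plane_y(x):
    return x[1]

def convex_merge(x):      # intersection: correct for a convex edge
    return max(plane_x(x), plane_y(x))

def wrong_merge(x):       # union: the wrong operator for this edge
    return min(plane_x(x), plane_y(x))

# Points on the wedge surface, including one on the sharp edge itself.
X_C = [(0.0, -0.5, 0.0), (-0.3, 0.0, 0.0), (0.0, 0.0, 0.0)]
good = merge_loss(convex_merge, X_C)
bad = merge_loss(wrong_merge, X_C)
```

With the correct operator every on-surface sample lies on the merged zero-level set and the loss vanishes. With the wrong operator the samples away from the edge pick up a nonzero value, so the loss is positive.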

4. End-to-End Patchification and Inference Workflow

The stepwise workflow is as follows:

  1. Preprocessing/Patchification: For each $S_p$, compute a minimally enclosing bounding box $B_p$, select grid resolution $\Delta_p$ based on local feature size, build the patch voxel grid $V_p$ by subdividing and pruning, and initialize the feature grids.
  2. Octree Construction: Recursively subdivide the 3D domain using the aforementioned connectivity logic until the octree leaves localize the patch interactions.
  3. Training: For a number of iterations, perform:
    • Sampling of on- and off-surface points per patch.
    • Backpropagation of per-patch losses and per-leaf merge losses to update feature codes and MLP weights.

The total loss is aggregated as the average per-patch loss plus a weighted average of the per-leaf merge losses.

  4. Inference/Mesh Extraction:
    • For any spatial query $x$, identify the relevant patch volumes $V_p$.
    • Compute $s_p(x)$ for each.
    • Use the octree to find the corresponding merge cell and compute the final SDF value.
    • Extract the mesh from the global SDF using Marching Cubes at isovalue 0.
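
The query path of the inference step can be sketched as below. Several simplifications are assumptions made for brevity: analytic plane SDFs stand in for the learned $s_p$, patch volumes are plain bounding boxes, leaves are searched linearly instead of through an octree, and each leaf stores its merge operator in a hypothetical `op` field.

```python
def in_box(box, x):
    lo, hi = box
    return all(lo[i] <= x[i] <= hi[i] for i in range(3))

def query_sdf(x, leaves, patch_sdfs, patch_volumes):
    """Merged SDF at x, or None when no patch volume covers x."""
    # Locate the merge cell (a real implementation would descend the octree).
    leaf = next((lf for lf in leaves if in_box(lf["cell"], x)), None)
    if leaf is None:
        return None
    # Evaluate only the patches of this leaf whose volume contains x.
    values = [patch_sdfs[p](x) for p in leaf["patches"]
              if in_box(patch_volumes[p], x)]
    if not values:
        return None
    return max(values) if leaf["op"] == "intersection" else min(values)

# Two planes meeting at a convex edge along x = 0.5, y = 0.5.
patch_sdfs = {
    "right": lambda x: x[0] - 0.5,
    "top": lambda x: x[1] - 0.5,
}
patch_volumes = {
    "right": ((0.3, 0.0, 0.0), (0.7, 1.0, 1.0)),
    "top": ((0.0, 0.3, 0.0), (1.0, 0.7, 1.0)),
}
leaves = [{"cell": ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
           "patches": ["right", "top"], "op": "intersection"}]

near_edge = query_sdf((0.6, 0.4, 0.5), leaves, patch_sdfs, patch_volumes)
far_away = query_sdf((0.1, 0.1, 0.5), leaves, patch_sdfs, patch_volumes)
```

Sampling such a query function on a regular lattice gives a scalar volume from which a mesh can be extracted at isovalue 0, for instance with scikit-image's `skimage.measure.marching_cubes`. Points where the function returns `None` would first need a fill value on the appropriate side of zero.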

5. Computational Complexity, Memory, and Benchmarks

The voxel grid patchification strategy in Patch-Grid significantly reduces computation compared to global grid or monolithic CSG merging approaches:

| Approach | Training Time (s) | Memory Usage | Merge Complexity |
|---|---|---|---|
| Patch-Grid | ≈ 8 | ~640 MB (5–10M × 32 × 4B) | $O(\lvert P_C \rvert \cdot \#\text{samples})$ |
| NH-Rep (monolithic CSG) | ≈ 185 | ~2.6 GB (128³ × 32 × 4B) | $O(K)$ |
| NGLOD (global octree + MLP) | ≈ 2296 | N/A | $O(K)$ |

  • In Patch-Grid, per-point evaluation cost is $O(\log N + |P_C| \cdot D + \text{MLP})$, where $|P_C|$ is typically $2$–$4$ and the hierarchical lookup in the octree is $O(\log N)$.
  • Global dense grids would require $200^3$ feature codes, while patchified grids sum up to a much lower aggregate voxel count.
  • Patch-Grid achieves 20–300× speedup over alternatives in empirical benchmarks on RTX-4090 hardware, with comparable or superior surface reconstruction fidelity for CAD and other sharp-featured domains (Lin et al., 2023).
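
The Patch-Grid memory figure and the speedup range follow from simple arithmetic on the numbers quoted above. The sketch below reproduces them. It assumes 32-dimensional float32 feature codes for the dense $200^3$ grid as well, which the text does not state, and the function name is hypothetical.

```python
def feature_grid_bytes(n_codes, feature_dim=32, bytes_per_value=4):
    """Storage for n_codes feature vectors of float32 values."""
    return n_codes * feature_dim * bytes_per_value

patch_grid_low = feature_grid_bytes(5_000_000)     # lower end of the 5-10M range
patch_grid_high = feature_grid_bytes(10_000_000)   # upper end of the 5-10M range
dense_200 = feature_grid_bytes(200 ** 3)           # dense grid, assumed D = 32

speedup_vs_nhrep = 185 / 8      # training times taken from the table
speedup_vs_nglod = 2296 / 8

print(patch_grid_low / 1e6, "MB")       # 640.0 MB
print(dense_200 / 1e9, "GB")            # 1.024 GB
print(speedup_vs_nhrep, speedup_vs_nglod)
```

The quoted ~640 MB corresponds to the 5M end of the stated code-count range, and the two training-time ratios (roughly 23× and 287×) account for the reported 20–300× speedup.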

6. Implications for Neural Implicit Geometry Modeling

Voxel grid patchification enables neural implicit representations to faithfully reconstruct sharp and structured 3D geometry, open boundaries, and thin structures, which present significant challenges for conventional monolithic MLP-based models. By “patchifying” the domain and only merging local SDFs in regions of geometric interaction, the approach localizes both parameterization and computation, yielding high accuracy and orders of magnitude improvement in speed and memory demand.

A plausible implication is that such strategies are highly extensible to other domain decomposition tasks in graphics, CAD, and neural field modeling, particularly where feature preservation and computational resource constraints are critical. The architectural decoupling of encoding (local feature grids) and decoding (shared MLP) facilitates both flexibility and scalability, suggesting applicability to larger, more complex scenes and multimodal 3D data.

7. Summary

Voxel grid patchification, as instantiated in Patch-Grid, decomposes complex shapes into per-patch feature volumes that are efficiently trained with a shared MLP and later merged using adaptive, CSG-based operations within a sparse hierarchical structure. This enables ultra-fast training, precise geometric reconstruction, and effective handling of features and boundaries. The method achieves substantial gains over global voxel grid or monolithic CSG paradigms in both efficiency and feature fidelity, offering a robust foundation for further advances in neural implicit 3D shape representations (Lin et al., 2023).
