Triangle Soup Representation in 3D Graphics

Updated 24 June 2026

A triangle soup representation is an unstructured collection of triangles in 3D space, used for dynamic scene representation, neural rendering, and other graphics applications.
Recent formulations support differentiable rendering and robust simplification, improving both the precision of scene manipulation and processing efficiency.
Applications extend to SLAM, VR/AR, and dynamic editing, supporting real-time rendering, compression, and scene interactivity.

A triangle soup representation is an unstructured collection of triangles in three-dimensional space: triangles may overlap, may be disconnected, and carry no guaranteed topological or mesh connectivity. This minimalist abstraction serves as an explicit, hardware-friendly primitive for representing 3D geometry, appearance, and dynamics, with particular relevance to computer graphics, 3D vision, neural rendering, and geometry processing. Recent advances reformulate triangle soups to support differentiable rendering, dynamic scene manipulation, and robust simplification, yielding state-of-the-art performance in fidelity, editability, and efficiency.

1. Formal Definition and Data Structures

A triangle soup is a tuple $(V,\,T,\,A)$, where $V = \{\mathbf{v}_i \in \mathbb{R}^3\}$ is a set of vertices, $T = \{(i,j,k)\}$ is a set of triangles referencing vertices by index, and $A$ is a list of attributes attached to each vertex or triangle, such as color, opacity, or neural features (Chou et al., 2016, Tojo et al., 28 Mar 2026, Fry et al., 29 May 2026, Held et al., 25 May 2025). In the most general case, triangles need not share vertex indices even when they are geometrically colocated. No additional connectivity (e.g., edge adjacency) is assumed, and the collection may contain degenerate faces. Because no structure is assumed, triangle soups tolerate noise and degeneracies and admit floating, overlapping, or disjoint triangles.

For time-varying scenes, this representation is extended as a sequence $\{ P^{(t)} = (V^{(t)}, T^{(t)}, A^{(t)}) \}$, optionally with temporally consistent triangle indices to enable correspondence across frames (Chou et al., 2016). Implementation typically relies on array-based storage of vertex and triangle lists, possibly augmented with per-triangle or per-vertex textures or neural features for downstream tasks (Tojo et al., 28 Mar 2026).
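As a minimal sketch of the $(V, T, A)$ tuple and its time-varying extension, the array-based storage described above might look as follows. The class name `TriangleSoup`, the helper `translated`, and the attribute keys `color` and `opacity` are illustrative assumptions, not names taken from any of the cited works.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Tri = Tuple[int, int, int]


@dataclass
class TriangleSoup:
    """One frame P = (V, T, A): vertices, index triples, per-triangle attributes."""
    vertices: List[Vec3]
    triangles: List[Tri]
    attributes: List[Dict] = field(default_factory=list)

    def corners(self, t: int) -> Tuple[Vec3, Vec3, Vec3]:
        i, j, k = self.triangles[t]
        return self.vertices[i], self.vertices[j], self.vertices[k]

    def is_degenerate(self, t: int, eps: float = 1e-12) -> bool:
        """True when the triangle has (numerically) zero area."""
        a, b, c = self.corners(t)
        u = [b[n] - a[n] for n in range(3)]
        v = [c[n] - a[n] for n in range(3)]
        cross = (u[1] * v[2] - u[2] * v[1],
                 u[2] * v[0] - u[0] * v[2],
                 u[0] * v[1] - u[1] * v[0])
        # Squared norm of the cross product equals (2 * area) ** 2.
        return sum(x * x for x in cross) <= eps


def translated(soup: TriangleSoup, offset: Vec3) -> TriangleSoup:
    """Hypothetical next frame: same triangle indices, moved vertices."""
    moved = [tuple(c + o for c, o in zip(v, offset)) for v in soup.vertices]
    return TriangleSoup(moved, soup.triangles, soup.attributes)


# Triangles 0 and 1 touch geometrically but share no vertex index;
# triangle 2 is degenerate (two identical corners). Both are legal in a soup.
frame0 = TriangleSoup(
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
              (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0),
              (2.0, 2.0, 2.0), (2.0, 2.0, 2.0), (3.0, 3.0, 3.0)],
    triangles=[(0, 1, 2), (3, 4, 5), (6, 7, 8)],
    attributes=[{"color": (255, 0, 0), "opacity": 1.0},
                {"color": (0, 255, 0), "opacity": 0.5},
                {"color": (0, 0, 255), "opacity": 0.2}],
)

# A two-frame sequence with temporally consistent triangle indices.
frames = [frame0, translated(frame0, (0.0, 0.0, 0.1))]
```

Because the triangle list is reused unchanged, triangle `t` in one frame corresponds to triangle `t` in the next, which is the property that inter-frame coding relies on.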

2. Conversion to Higher Structures and Topological Augmentation

In raw form, triangle soups lack explicit edge or face connectivity. Applications that require topological queries or mesh editing can convert a triangle soup into a simplicial 2-complex $K = (V, E, F)$, where $E$ is an edge set constructed as follows (Liu et al., 2024):

"Physical" edges are induced by the union of all triangle vertex pairs.
"Virtual" edges are added to bridge spatially neighboring, disconnected components. These are identified from centroidal proximity and geometric closest-point distance via KD-tree queries.
The complex is non-manifold: edges may be shared arbitrarily, and manifoldness checks are explicitly omitted.

Adjacency lists such as $\mathrm{star}_0(v)$ (edges incident to a vertex), $\mathrm{star}_1(e)$ (faces incident to an edge), and $\mathrm{star}'_0(v)$ (faces incident to a vertex) support efficient traversal and topological updates (Liu et al., 2024).
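The construction above can be sketched in a few lines. This is a simplified illustration under stated assumptions: virtual edges are found here by a brute-force distance test between vertices of different connected components, standing in for the centroid and closest-point KD-tree queries of the cited work, and the function name `build_complex` and parameter `virtual_radius` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def _find(parent, x):
    # Union-find root lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def build_complex(vertices, triangles, virtual_radius):
    """Return physical edges, virtual edges, and the three star maps."""
    physical = set()
    star1 = defaultdict(set)        # edge   -> incident faces
    star0_faces = defaultdict(set)  # vertex -> incident faces
    for f, tri in enumerate(triangles):
        for v in tri:
            star0_faces[v].add(f)
        for a, b in combinations(tri, 2):
            if a != b:  # skip index-degenerate pairs
                e = (min(a, b), max(a, b))
                physical.add(e)
                star1[e].add(f)  # no manifoldness check: any face count is allowed

    # Connected components induced by physical edges.
    parent = list(range(len(vertices)))
    for a, b in physical:
        parent[_find(parent, a)] = _find(parent, b)

    # Virtual edges bridge nearby vertices of different components
    # (brute force here; a KD-tree would replace this double loop).
    virtual = set()
    for a, b in combinations(range(len(vertices)), 2):
        if _find(parent, a) != _find(parent, b):
            if math.dist(vertices[a], vertices[b]) <= virtual_radius:
                virtual.add((a, b))

    star0 = defaultdict(set)        # vertex -> incident edges
    for e in physical | virtual:
        star0[e[0]].add(e)
        star0[e[1]].add(e)
    return physical, virtual, star0, star1, star0_faces


vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0),
            (1.05, 1, 0), (2, 1, 0), (1, 2, 0), (0.5, 0.5, 1)]
# Edge (1, 2) is shared by three faces (non-manifold); face 3 is a separate component.
triangles = [(0, 1, 2), (1, 2, 3), (1, 2, 7), (4, 5, 6)]
physical, virtual, star0, star1, star0_faces = build_complex(
    vertices, triangles, virtual_radius=0.1)
```

In this example the only virtual edge joins vertices 3 and 4, which lie 0.05 apart in different components, and `star1[(1, 2)]` holds all three faces around the non-manifold edge.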

This topological lifting enables robust edge-collapsing mesh simplification even in the presence of non-manifold or fragmented inputs, as illustrated by "Simplifying Triangle Meshes in the Wild" (Liu et al., 2024).
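To make the edge-collapse operation concrete, the sketch below merges one endpoint of an edge into the other and drops the faces that become degenerate. It is an assumption-laden illustration: the merged vertex is placed at the midpoint, whereas a real simplifier would choose the position and the collapse order from an error metric, and the function name `collapse_edge` is hypothetical.

```python
def collapse_edge(vertices, triangles, a, b):
    """Merge vertex b into vertex a and remove faces that collapse.

    Works regardless of how many faces share the edge (a, b), so
    non-manifold edges need no special handling.
    """
    new_vertices = list(vertices)
    # Midpoint placement is a simplification chosen for this sketch.
    new_vertices[a] = tuple((p + q) / 2 for p, q in zip(vertices[a], vertices[b]))

    new_triangles = []
    for tri in triangles:
        tri = tuple(a if v == b else v for v in tri)  # redirect b -> a
        if len(set(tri)) == 3:                        # keep non-collapsed faces only
            new_triangles.append(tri)
    # Vertex b stays in the array but is no longer referenced.
    return new_vertices, new_triangles


verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
         (1.0, 1.0, 0.0), (0.5, 0.5, 1.0), (0.5, -1.0, 0.0)]
# Three faces share edge (1, 2); two more touch only one of its endpoints.
tris = [(0, 1, 2), (1, 2, 3), (1, 2, 4), (0, 1, 5), (2, 3, 4)]
new_verts, new_tris = collapse_edge(verts, tris, a=1, b=2)
```

Here the three faces around the collapsed edge disappear, and the face that referenced only vertex 2 is redirected to vertex 1.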

3. Differentiable Rendering and Radiance Field Optimization

Recent neural field paradigms leverage triangle soups as differentiable, trainable scene primitives. Each triangle in the soup is parameterized by world-space vertices, color or neural texture features, and an opacity or smoothness parameter (Held et al., 25 May 2025, Tojo et al., 28 Mar 2026, Burgdorfer et al., 29 May 2025, Kupyn et al., 23 Jun 2026). Differentiable rendering proceeds as follows:

Each triangle is projected into screen space. For splatting, a soft window function (e.g., exponentiated, normalized signed distance from the triangle incenter, or a product "window" over edges) determines per-pixel coverage (Held et al., 25 May 2025, Kupyn et al., 23 Jun 2026).
Opacity and color are distributed according to soft compositing in front-to-back depth order, enabling gradients to propagate to all triangle parameters (Kupyn et al., 23 Jun 2026, Held et al., 25 May 2025).
Rasterization can be augmented with stochastic masking (binary opacity) for unbiased gradient estimates, as in DiffSoup (Tojo et al., 28 Mar 2026).
Texture mapping is managed either via barycentric interpolation of neural features or per-triangle neural textures evaluated by MLPs (Tojo et al., 28 Mar 2026, Burgdorfer et al., 29 May 2025).
Optimization objectives combine photometric and structural similarity losses, with regularizers for normal smoothness and triangle shape.
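The splatting and compositing steps can be illustrated with a small CPU sketch. It assumes one plausible form of the soft window, namely the signed distance to the nearest edge normalized by its value at the incenter and raised to a sharpness exponent `sigma`; the exact window differs between the cited methods. The triangle is assumed to be non-degenerate and counter-clockwise in screen space, and all function names are hypothetical.

```python
import math


def signed_edge_distances(p, tri):
    """Signed distances from 2D point p to the edge lines of a CCW triangle.

    Negative values lie on the interior side of an edge.
    """
    out = []
    for i in range(3):
        (ax, ay), (bx, by) = tri[i], tri[(i + 1) % 3]
        ex, ey = bx - ax, by - ay
        # Outward normal of a CCW edge is (ey, -ex) / |e|.
        out.append(((p[0] - ax) * ey - (p[1] - ay) * ex) / math.hypot(ex, ey))
    return out


def incenter(tri):
    (ax, ay), (bx, by), (cx, cy) = tri
    la = math.hypot(bx - cx, by - cy)  # side opposite vertex a
    lb = math.hypot(cx - ax, cy - ay)  # side opposite vertex b
    lc = math.hypot(ax - bx, ay - by)  # side opposite vertex c
    total = la + lb + lc
    return ((la * ax + lb * bx + lc * cx) / total,
            (la * ay + lb * by + lc * cy) / total)


def soft_window(p, tri, sigma):
    """1 at the incenter, 0 on the boundary and outside; sigma sets sharpness."""
    phi_p = max(signed_edge_distances(p, tri))
    phi_s = max(signed_edge_distances(incenter(tri), tri))  # equals -inradius
    return max(0.0, phi_p / phi_s) ** sigma


def composite_front_to_back(samples):
    """Alpha-composite (depth, rgb, alpha) samples for one pixel, nearest first."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for _, rgb, alpha in sorted(samples, key=lambda s: s[0]):
        weight = transmittance * alpha
        color = [acc + weight * ch for acc, ch in zip(color, rgb)]
        transmittance *= 1.0 - alpha
    return color, transmittance


tri = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
pixel = (1.0, 0.5)
# Per-pixel alpha = triangle opacity * window coverage.
alpha_near = 0.8 * soft_window(pixel, tri, sigma=2.0)
pixel_color, remaining = composite_front_to_back([
    (2.0, (0.0, 0.0, 1.0), 1.0),         # opaque blue triangle behind
    (1.0, (1.0, 0.0, 0.0), alpha_near),  # soft red triangle in front
])
```

Every operation here is a smooth function of the vertex positions, opacity, and color wherever the window is positive, which is what lets gradients reach all triangle parameters in an automatic-differentiation framework.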

This framework achieves real-time or near-real-time rendering (500–2,400 FPS), state-of-the-art perceptual fidelity (e.g., best LPIPS scores among non-volumetric representations), and high geometric accuracy (normal-cosine up to ≈0.85) (Held et al., 25 May 2025, Kupyn et al., 23 Jun 2026).

4. Applications in Dynamic, Editable, and SLAM Contexts

Triangle soups are a foundational primitive in multiple domains:

Dynamic Scene Compression and Streaming: As shown in "Dynamic Polygon Clouds," triangle soups enable robust geometry and color compression for dynamic VR/AR, outperforming both point clouds and manifold meshes in bit-rate and resilience to capture noise (Chou et al., 2016). Time-varying soups maintain corresponding triangle indices for efficient inter-frame coding.
Real-time SLAM and Scene Mapping: "Triangle Splatting SLAM" introduces dense RGB-D mapping/tracking using differentiable triangle soups. The map can be meshed on-the-fly via restricted Delaunay triangulation, supporting rapid geometry extraction, mesh deformation, and collision checking directly from an initially disconnected soup (Fry et al., 29 May 2026).
Editable Dynamics: "D-MiSo" leverages triangle soups for explicit, artist-controllable dynamic 3D scene editing. Triangles are grouped via barycentric offsets from core faces, with trajectory and deformation modeled by lightweight neural networks, offering fine-grained control in dynamic neural rendering (Waczyńska et al., 2024).
Feedforward Scene Generation: "FLAT" decodes triangle splats directly from video diffusion latents via a ray-centered local parameterization, yielding explicit surface representations game engines can consume, in contrast to volumetric Gaussian splatting (Kupyn et al., 23 Jun 2026).

5. Soft Connectivity, Optimization, and Mesh Extraction

While triangle soups are unstructured, multiple works introduce "soft connectivity forces" or post-hoc strategies to encourage surface coherence (Burgdorfer et al., 29 May 2025):

Connectivity Forces: During inference-time optimization, geometric penalties are added between triangle pairs whose edges nearly align in 3D, encouraging smooth, implicit surface formation without explicit connectivity (Burgdorfer et al., 29 May 2025). Forces are based on Euclidean edge alignment and normal consistency.
On-the-fly Mesh Extraction: Restricted Delaunay triangulation on vertices provides a watertight, manifold mesh for downstream tasks such as simulation, collision checking, and physically based shading, constructed directly from the soup after parameter refinement or differentiable rendering (Fry et al., 29 May 2026).
Refinement and Binarization: Postprocessing steps such as opacity binarization, photometric refinement, and Laplacian repair enable conversion of a soft, possibly semi-transparent triangle soup into an opaque, watertight mesh compatible with real-time hardware rendering (Kupyn et al., 23 Jun 2026).
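As a rough illustration of a soft connectivity force, the sketch below penalizes a pair of triangles whose closest edges nearly coincide, combining an edge-alignment term with a normal-consistency term. The specific formula, the threshold `max_misalignment`, and the weight `normal_weight` are assumptions made for this example and are not the loss used in the cited work.

```python
import math


def _sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def _sq_dist(p, q):
    return sum(c * c for c in _sub(p, q))


def _unit_normal(tri):
    u, v = _sub(tri[1], tri[0]), _sub(tri[2], tri[0])
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n) if length > 0 else (0.0, 0.0, 0.0)


def _edges(tri):
    return [(tri[i], tri[(i + 1) % 3]) for i in range(3)]


def edge_misalignment(ea, eb):
    """Summed squared endpoint distance, minimized over the two edge orientations."""
    same = _sq_dist(ea[0], eb[0]) + _sq_dist(ea[1], eb[1])
    flipped = _sq_dist(ea[0], eb[1]) + _sq_dist(ea[1], eb[0])
    return min(same, flipped)


def connectivity_penalty(tri_a, tri_b, max_misalignment=0.01, normal_weight=0.1):
    """Zero for distant pairs; otherwise edge gap plus normal disagreement."""
    best = min(edge_misalignment(ea, eb)
               for ea in _edges(tri_a) for eb in _edges(tri_b))
    if best > max_misalignment:
        return 0.0  # edges are not close enough to be treated as neighbors
    na, nb = _unit_normal(tri_a), _unit_normal(tri_b)
    cos = abs(sum(x * y for x, y in zip(na, nb)))  # orientation-agnostic
    return best + normal_weight * max(0.0, 1.0 - cos)


tri_a = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
tri_shared = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0))
tri_gapped = tuple(tuple(c + d for c, d in zip(v, (0.01, 0.01, 0.0)))
                   for v in tri_shared)
tri_far = tuple(tuple(c + 5.0 for c in v) for v in tri_shared)

p_shared = connectivity_penalty(tri_a, tri_shared)
p_gapped = connectivity_penalty(tri_a, tri_gapped)
p_far = connectivity_penalty(tri_a, tri_far)
```

Minimizing such a term over nearby pairs pulls almost-touching edges together and flattens small creases, without ever storing explicit connectivity.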

6. Compression Algorithms and Performance

Triangle soups support highly efficient geometry and attribute compression:

Octree/Voxel Quantization: Vertex positions are quantized onto an octree grid, encoded via Morton codes, and grouped to reduce redundancy and noise (Chou et al., 2016).
Region-Adaptive Hierarchical Transform (RAHT): Colors and attributes are compressed via RAHT, which exploits voxel-level redundancy hierarchically. Reported bit rates are 5–10× lower than octree-only point cloud coders for geometry, and inter-frame motion/color prediction enables a further 2–5× reduction (Chou et al., 2016).
Bit Rate and Quality: Dynamic triangle cloud codecs outperform prior state-of-the-art point cloud methods by ≈33% in bit rate for comparable fidelity, with triangle attributes supporting rich color and temporal coherence (Chou et al., 2016).
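Two building blocks of this pipeline can be sketched briefly: quantizing vertex positions to an octree grid addressed by Morton codes, and the weighted two-point butterfly that RAHT applies to neighboring voxels. The bit ordering (x in the lowest bit), the cubic bounding box, and all function names are assumptions of this sketch; entropy coding and inter-frame prediction are omitted.

```python
import math


def quantize(v, origin, size, depth):
    """Map a 3D point to integer cell coordinates on a 2**depth grid."""
    cells = 1 << depth
    return tuple(
        min(cells - 1, max(0, int((v[n] - origin[n]) / size * cells)))
        for n in range(3)
    )


def morton3(x, y, z, depth):
    """Interleave the bits of x, y, z (x lowest) into one octree key."""
    code = 0
    for bit in range(depth):
        code |= ((x >> bit) & 1) << (3 * bit)
        code |= ((y >> bit) & 1) << (3 * bit + 1)
        code |= ((z >> bit) & 1) << (3 * bit + 2)
    return code


def raht_butterfly(a1, w1, a2, w2):
    """Weighted orthonormal merge of two sibling attributes.

    Returns the low-pass value (passed up the tree), the high-pass
    coefficient (quantized and coded), and the merged weight.
    """
    s = math.sqrt(w1 + w2)
    low = (math.sqrt(w1) * a1 + math.sqrt(w2) * a2) / s
    high = (-math.sqrt(w2) * a1 + math.sqrt(w1) * a2) / s
    return low, high, w1 + w2


depth = 2
points = [(0.10, 0.10, 0.10), (0.12, 0.11, 0.10), (0.50, 0.50, 0.50)]
codes = [morton3(*quantize(p, (0.0, 0.0, 0.0), 1.0, depth), depth) for p in points]
unique_codes = sorted(set(codes))  # nearby vertices collapse to one voxel

low, high, weight = raht_butterfly(10.0, 1, 10.0, 1)
```

Sorting by Morton code places voxels that share an octree parent next to each other, and the butterfly concentrates the energy of similar neighboring attributes in the low-pass value, leaving a small high-pass coefficient that is cheap to code.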

7. Advantages, Limitations, and Research Trajectory

Advantages:

Robustness to non-manifoldness, gaps, and degenerate structures: the cited pipelines are reported to run without failure on "wild" inputs (Liu et al., 2024, Chou et al., 2016).
Hardware compatibility, high throughput rendering, and real-time interactive rates on commodity GPUs (Held et al., 25 May 2025, Tojo et al., 28 Mar 2026).
Adaptive density, surface-centric (as opposed to volumetric) representation, and direct compatibility with mesh-based applications (Held et al., 25 May 2025, Fry et al., 29 May 2026).
Explicit surface geometry supporting direct editing, collision, and simulation (Fry et al., 29 May 2026, Waczyńska et al., 2024).

Limitations:

Lack of built-in topological connectivity requires explicit remedies for watertight mesh construction or soft constraints during optimization (Held et al., 25 May 2025, Burgdorfer et al., 29 May 2025).
In very unstructured outdoor scenes, triangle sampling can result in "floaters" or disconnected fragments, sometimes reducing PSNR despite high perceptual quality (Held et al., 25 May 2025).
Memory use may be higher than for very coarse voxel fields in large scenes (Fry et al., 29 May 2026).
For some tasks, achieving topological consistency online is still a challenge, leading to deferred meshing (Fry et al., 29 May 2026).

These advances position triangle soup representations as a unifying primitive between meshes and point clouds, enabling robust, editable, and efficient 3D scene representations for graphics, vision, and robotics (Chou et al., 2016, Liu et al., 2024, Fry et al., 29 May 2026, Burgdorfer et al., 29 May 2025, Kupyn et al., 23 Jun 2026).