Filter-Based Geometry Conversion
- Filter-Based Geometry Conversion is a family of methods that convert geometric data via constrained filters rather than direct editing, ensuring invariants like color-geometry consistency and equivariance.
- It applies across domains such as radiance field stylization using depth guides, mesh transformation via production tables, topological data analysis through adaptive box filtration, and filter transforms in steerable CNNs.
- These techniques balance flexibility and structural preservation, enabling robust conversions that maintain properties like topology, boundary orientation, and group symmetries.
Filter-based geometry conversion denotes, in the current literature, a family of procedures in which geometric structure is transformed by a filtering rule, guide, or formal transform rather than by unrestricted direct editing. In one line of work, a depth map acts as a style guide that deforms a radiance field while preserving color–geometry consistency; in another, a computational mesh is rewritten by local production rules on its Hasse diagram and can even be represented ephemerally; in topological data analysis, a point cloud is converted into a nested nerve complex by adaptive box growth; and in steerable CNNs, a learned filter is converted into a group-indexed kernel by rotations and reflections that satisfy equivariance constraints (Jung et al., 2024, Knepley, 19 Jun 2025, Alvarado et al., 2024, Li et al., 2021).
1. Conceptual scope
The cited work uses the word filter in several technically distinct senses. In radiance-field stylization, the operative filter is a style guide derived from depth or RGB-D cues. In computational meshes, the operative filter is a local graph transformation that can select, retain, suppress, or restructure mesh entities. In box filtration, the operative filter is a filtration over adaptive box covers whose nerves form a nested simplicial-complex sequence. In FILTRA, the operative filter is a transformed copy of a base convolution kernel organized by group representations (Jung et al., 2024, Knepley, 19 Jun 2025, Alvarado et al., 2024, Li et al., 2021).
| Domain | Conversion | Operative mechanism |
|---|---|---|
| Radiance fields | content scene → stylized geometry or RGB-D scene | depth-map style guide and deformation grid |
| Computational meshes | source mesh → transformed or ephemeral mesh | table-driven grammar on the Hasse diagram |
| Point-cloud TDA | point cloud data (PCD) → box-cover nerve filtration | LP-guided box growth |
| Steerable CNNs | base filter → group-indexed kernel | rotations/reflections under representation constraints |
A plausible implication is that “filter-based geometry conversion” is best understood as an umbrella description rather than a single standardized formalism. What is common across these works is not a shared data structure, but a shared strategy: conversion is mediated by a constrained operator that restricts how geometry can change while preserving some structural property, such as correspondence between color and density, boundary orientation, homotopy type, or group equivariance.
2. Depth-guided conversion of radiance-field geometry
“Geometry Transfer for Stylizing Radiance Fields” argues that 3D style transfer should not be limited to colors, textures, and brushstrokes, because shape and geometric patterns are essential in defining stylistic identity. Its pipeline first reconstructs the content scene with TensoRF, maintaining a color grid and a density grid, and then stylizes the scene from a style guide. In the geometry-transfer setting, the guide is a depth map; in the full stylization setting, it is an RGB image plus a depth map. During stylization, the method renders novel views, extracts VGG-16 features from rendered outputs and style guides, and optimizes radiance-field parameters so that the scene matches the style in both appearance and geometry (Jung et al., 2024).
The central technical claim is that directly optimizing the density grid to match a depth style changes the geometry but leaves the appearance field fixed, so colors become misaligned with the deformed surface. The method therefore introduces a deformation grid that predicts 3D offsets for sampled points, while the canonical density field is kept intact. During stylization, the density grid is fixed, the color grid and the deformation grid are optimized, VGG-16 conv2 and conv3 features are used, and view-consistent color transfer is applied before and after stylization. The deformation fields are initialized so that they output zeros for sampled input points.
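The role of the deformation grid can be illustrated with a minimal sketch. It assumes that density is evaluated by querying a frozen canonical field at the offset position; the soft-sphere density, the nearest-cell lookup, and all function names are hypothetical stand-ins, not the paper's TensoRF implementation.

```python
import math

def make_offset_grid(n):
    """n x n x n grid of 3D offsets, initialized to zero (no deformation)."""
    return [[[(0.0, 0.0, 0.0) for _ in range(n)] for _ in range(n)] for _ in range(n)]

def canonical_density(p):
    """Hypothetical frozen density: a solid sphere of radius 0.5 at the origin."""
    r = math.sqrt(sum(c * c for c in p))
    return 1.0 if r <= 0.5 else 0.0

def lookup_offset(grid, p, lo=-1.0, hi=1.0):
    """Nearest-cell lookup of the offset stored for a point p in [lo, hi]^3."""
    n = len(grid)
    idx = [min(n - 1, max(0, int((c - lo) / (hi - lo) * n))) for c in p]
    return grid[idx[0]][idx[1]][idx[2]]

def deformed_density(grid, p):
    """Assumed warp: evaluate the canonical density at p plus its offset."""
    offset = lookup_offset(grid, p)
    warped = tuple(c + d for c, d in zip(p, offset))
    return canonical_density(warped)

grid = make_offset_grid(4)
point = (0.6, 0.0, 0.0)
before = deformed_density(grid, point)   # zero offsets: same as canonical field
grid[3][2][2] = (-0.2, 0.0, 0.0)         # "optimize" one offset cell
after = deformed_density(grid, point)    # geometry changes, canonical field does not
```

With zero-initialized offsets the deformed scene coincides with the reconstructed one, which is why stylization can start from the content geometry and move away from it gradually.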
The geometry-transfer mechanism treats a depth map as a style image. For a random camera viewpoint, the rendered depth map is compared against the depth style guide after depth is replicated across channels for VGG feature extraction. In the geometry-transfer case, RGB features are replaced by depth features, and the depth map functions as a shape/style template encoding silhouette, contouring, thickness, and coarse spatial form. The paper extends this with geometry-aware RGB-D stylization by concatenating RGB and depth features for nearest matching, computing separate RGB and depth losses from the same match, and averaging them over features. This joint matching is intended to prevent inconsistent pairings that would arise if appearance and geometry were matched independently.
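The joint matching step can be sketched as follows. This is an illustration under stated assumptions rather than the paper's code: features are plain Python lists, the distance is cosine distance (an assumption borrowed from common nearest-neighbor feature matching losses), and `joint_nn_losses` is a hypothetical name.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb + 1e-12)

def joint_nn_losses(render_rgb, render_depth, style_rgb, style_depth):
    """Match on concatenated RGB+depth features, then score each modality.

    Each argument is a list of feature vectors, one per spatial location.
    Returns (mean RGB loss, mean depth loss, matched style indices).
    """
    rgb_loss, depth_loss, matches = 0.0, 0.0, []
    for fr, fd in zip(render_rgb, render_depth):
        joint = fr + fd  # list concatenation of RGB and depth features
        j = min(range(len(style_rgb)),
                key=lambda k: cosine_distance(joint, style_rgb[k] + style_depth[k]))
        matches.append(j)
        # Both losses use the SAME match j, so appearance and geometry agree.
        rgb_loss += cosine_distance(fr, style_rgb[j])
        depth_loss += cosine_distance(fd, style_depth[j])
    n = len(render_rgb)
    return rgb_loss / n, depth_loss / n, matches
```

Because one index is chosen per rendered feature, a rendered location cannot borrow its color from one style region and its shape from another.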
The paper further states that geometry cannot be transferred well with per-pixel matching alone, because shape is relational rather than local. To address this, it introduces a patch-wise nearest-neighbor loss on patches of VGG features, with optional dilation to enlarge the receptive field. It also introduces perspective style augmentation: scene points are binned by depth, style images are downsampled at multiple scales, and each depth layer is matched to a corresponding scale. This causes closer surfaces to receive larger patterns and distant surfaces smaller patterns.
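A small sketch of the depth-to-scale assignment behind perspective style augmentation is given here. Uniform depth bins, the example scale values, and the function name are assumptions made for illustration; the paper's actual binning scheme is not reproduced.

```python
def assign_scale_by_depth(depths, scales):
    """Map each depth value to a style scale; nearer bins get larger scales.

    depths: list of per-point depths; scales: list of downsampling factors.
    Bins are uniform between the nearest and farthest depth (an assumption).
    """
    near, far = min(depths), max(depths)
    ordered = sorted(scales, reverse=True)  # largest pattern for the nearest bin
    n_bins = len(ordered)
    assigned = []
    for z in depths:
        t = 0.0 if far == near else (z - near) / (far - near)
        b = min(n_bins - 1, int(t * n_bins))
        assigned.append(ordered[b])
    return assigned

# Two depth layers: the near layer uses the full-size style, the far layer a half-size one.
layer_scales = assign_scale_by_depth([0.0, 1.0, 5.0, 10.0], [1.0, 0.5])
```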
The reported experiments are intended to show improvements in both appearance and geometry/style structure. On the trex scene, the method reports SIFID values of RGB 1.43, Gray 0.58, and Depth 0.44, compared with SNeRF at 1.62, 0.81, 0.59; ARF at 1.54, 0.64, 0.51; and Ref-NPR at 1.59, 0.72, 0.61. On the fern scene, it reports RGB 0.81, Gray 0.37, and Depth 0.28, compared with SNeRF at 1.32, 0.64, 0.40; ARF at 1.11, 0.48, 0.36; and Ref-NPR at 1.75, 0.79, 0.41. A user study with 22 participants over 12 stylized scenes gives an average rank of 1.55 for the proposed method, compared with 3.17, 2.58, and 2.70 for the baselines, and the method is selected as best 162 times out of 264 responses. The ablations attribute gains to geometry-aware matching, patch-wise optimization, and perspective style augmentation.
3. Mesh transformation as graph filtering and geometry conversion
“Transformations of Computational Meshes” models a mesh as a Hasse diagram, a directed acyclic graph whose vertices are mesh entities such as points, edges, faces, and cells. With 2 denoting a directed path of length 3 from 4 to 5, the paper defines 6 and states the duality 7, with 8 for closure and star. This graph formulation is the basis for treating many mesh modifications as local graph transformations rather than geometry-specific code (Knepley, 19 Jun 2025).
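The graph view can be made concrete with a toy example. The sketch below encodes a single triangle as a Hasse diagram through its cones (the entities one level down) and computes closure and star as transitive traversals along cone and support arrows. The point numbering and the helper names are illustrative choices, not the paper's or PETSc's API.

```python
# One triangle: cell 0, edges 1-3, vertices 4-6. Each cone lists the boundary entities.
cones = {0: [1, 2, 3], 1: [4, 5], 2: [5, 6], 3: [6, 4], 4: [], 5: [], 6: []}

def supports(cone_map):
    """Reverse every cone arrow: supports[q] lists the entities whose cone contains q."""
    sup = {p: [] for p in cone_map}
    for p, cone in cone_map.items():
        for q in cone:
            sup[q].append(p)
    return sup

def transitive(p, arrows):
    """All points reachable from p (including p) by following arrows."""
    seen, stack = {p}, [p]
    while stack:
        for q in arrows[stack.pop()]:
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def closure(p):
    """Transitive closure of the cone relation."""
    return transitive(p, cones)

def star(p):
    """Transitive closure of the support relation."""
    return transitive(p, supports(cones))
```

In this encoding, `q in closure(p)` holds exactly when `p in star(q)`, which is the duality between the two traversals.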
The transformation formalism is a restricted production-rule grammar. A rule acts on each source 9-cell and produces a set of target 0-cells together with their cones and oriented edges. Two locality conditions are emphasized. First, locality of production requires
1
from which the paper derives
2
Second, uniqueness of production assumes
3
which simplifies numbering and parallel execution and yields the support locality condition
4
This is the paper’s formal explanation for why a conversion can be local while still preserving recoverable adjacency and boundary information.
Each source cell type is encoded by compact tables specifying Nt, target, size, cone, and ornt. A cone entry is represented by a tuple-like list containing the target cell type, the number of cone levels to traverse, cone indices at each level, and a replica number. The paper gives explicit examples for regularly refined tetrahedra, including child segments, child triangles, and subtetrahedra, and it represents orientations with explicit metadata. Orientation preservation is further handled through dihedral-group lookup tables. For triangles, orientations are represented as 5, and for parent orientation 6, child replica number 7 and child orientation 8 are computed from a lookup table 9 by
0
The final child orientation is the composition of its inherited orientation with 1.
The paper also gives a contiguous numbering scheme. If 2 is a parent point with transformation type 3, and 4 is a child cell type 5 with replica number 6, then
7
where 8 is the offset of the first child of type 9, 0 is the reduced parent index among its type, and 1 is the number of replicas produced per parent. In parallel, only a small amount of offset data is communicated by a single allreduce, and remote numbering is patched through PetscSF.
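The numbering rule described above (offset of the first child of a given type, plus the reduced parent index times the replicas per parent, plus the replica number) can be sketched for a 1D refinement in which every vertex produces one vertex and every segment produces one midpoint vertex and two child segments. The production list, the ordering of offsets, and the function names are assumptions for illustration and need not match PETSc's ordering.

```python
# (parent type, child type, replicas produced per parent); order fixes the offsets.
productions = [("vertex", "vertex", 1), ("segment", "vertex", 1), ("segment", "segment", 2)]
parent_counts = {"vertex": 3, "segment": 2}  # a mesh with 3 vertices and 2 segments

def build_offsets(productions, parent_counts):
    """Prefix-sum the number of children generated by each production."""
    offsets, running = {}, 0
    for ptype, ctype, reps in productions:
        offsets[(ptype, ctype)] = running
        running += parent_counts[ptype] * reps
    return offsets, running

def child_number(offset, reduced_parent, replicas_per_parent, replica):
    """Contiguous number of one child: offset + parent index * replicas + replica."""
    if not 0 <= replica < replicas_per_parent:
        raise ValueError("replica out of range")
    return offset + reduced_parent * replicas_per_parent + replica

offsets, total = build_offsets(productions, parent_counts)
numbers = [child_number(offsets[(pt, ct)], parent, reps, r)
           for pt, ct, reps in productions
           for parent in range(parent_counts[pt])
           for r in range(reps)]
```

Since each child number depends only on its parent's index and a handful of offsets, every child can be numbered without inspecting any other part of the mesh, which is what makes the scheme convenient in parallel.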
Regular refinement and extrusion serve as the paper’s primary examples. Regular refinement rewrites each cell into a structured set of children, and the paper states that the same table-based mechanism works for quadrilaterals, hexahedra, and pyramids. Extrusion maps a mesh to a higher-dimensional one by replacing each cell with a prism-like cell formed by two copies of the original cell as faces. For extrusion across 2 layers, the transformation object again stores Nt, target, size, cone, and ornt. For a vertex it produces point and segment entities; for a triangle it produces triangle and prism-like entities, with the bottom endcap orientation reversed in the ordinary-prism example:
3
A distinctive implementation idea is the ephemeral mesh. Given a base mesh and a transform table, DMPlexCreateEphemeral() behaves like a regular DM but computes cones, closures, and related queries on demand rather than storing a transformed mesh explicitly. The paper states that query cost is output sensitive, proportional to the size of the answer. It also discusses surface-restricted extrusion via marker labels and adaptive refinement driven by labels or tags, including PETSc VecTagger and Dörfler marking. In this sense, mesh conversion is simultaneously rewriting, filtering, and virtualization.
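The ephemeral idea can be illustrated without PETSc. The following sketch stores only a base 1D mesh and answers cone queries on the regularly refined mesh on demand. The class and its numbering are hypothetical and are not the DMPlexCreateEphemeral() interface.

```python
class EphemeralRefinedMesh1D:
    """Refined 1D mesh that is never stored; cones are computed per query.

    Numbering (an assumed convention): copies of base vertices first, then one
    midpoint vertex per base segment, then two child segments per base segment.
    """

    def __init__(self, num_vertices, segments):
        self.nv = num_vertices
        self.segments = segments  # base segments as (vertex, vertex) pairs

    def size(self):
        ns = len(self.segments)
        return self.nv + ns + 2 * ns

    def cone(self, p):
        if not 0 <= p < self.size():
            raise IndexError("point out of range")
        first_child_segment = self.nv + len(self.segments)
        if p < first_child_segment:
            return ()  # vertices have empty cones
        s, r = divmod(p - first_child_segment, 2)  # parent segment, replica
        a, b = self.segments[s]
        mid = self.nv + s
        return (a, mid) if r == 0 else (mid, b)

mesh = EphemeralRefinedMesh1D(3, [(0, 1), (1, 2)])
child_cones = [mesh.cone(p) for p in range(5, 9)]
```

Each query touches only one parent segment, so its cost is proportional to the size of the answer, in line with the output-sensitive behavior described above.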
4. Box filtration as geometry-to-topology conversion
“Box Filtration” defines a framework that replaces isotropic ball growth with adaptive box growth. Starting from a finite point cloud data set 4, the method assigns to each initial pivot region a box
5
Instead of growing Euclidean balls symmetrically around points, it expands boxes non-uniformly and asymmetrically across dimensions based on an optimization that trades off point inclusion against box size. The resulting construction is both a filtration and, because it is built from covers and nerves, mapper-like (Alvarado et al., 2024).
The box filtration is denoted 6. For a current box 7, the neighborhood
8
defines the admissible region for growth. Repeated expansions produce
9
with cover
0
and the associated simplicial-complex sequence
1
Because boxes are convex, the paper invokes the nerve lemma, so the nerve records the homotopy type of the cover.
Two initial-cover models are developed. In the point cover, each point receives its own initial box, and boxes may be lower-dimensional in some coordinates. In the pixel cover, ambient space is discretized into unit cubes, with a point’s pixel defined by
2
and the nonempty pixels used by a box represented by
3
The pixel cover is coarser but faster, and the paper states that the same stability and structural results hold for it.
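A minimal sketch of the pixel cover follows. It assumes that a point's pixel is the unit cube obtained by flooring each coordinate, which is one natural reading of the unit-cube discretization; the function names are illustrative.

```python
import math
from collections import defaultdict

def pixel_of(point):
    """Index of the unit cube containing the point (floor of each coordinate)."""
    return tuple(math.floor(c) for c in point)

def pixel_cover(points):
    """Group points by pixel; only nonempty pixels appear in the result."""
    cover = defaultdict(list)
    for p in points:
        cover[pixel_of(p)].append(p)
    return dict(cover)

cloud = [(0.2, 0.7), (0.9, 0.1), (1.5, 0.5), (-0.3, 2.0)]
cover = pixel_cover(cloud)
```

The cover has at most as many initial boxes as there are points, and typically far fewer for dense clouds, which is consistent with the pixel cover being the faster of the two models.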
The growth rule is defined by linear programming. In the point-cover version, for each point 4 one defines a weight
5
and minimizes
6
The first term rewards including more points and points farther inside the box; the second penalizes box size. In the pixel-cover version, analogous weights 7 are used with centroid-based constraints and pixel counts 8:
9
The paper’s emphasis is that the expansion is not forced to be the same in all dimensions.
The theoretical guarantees are unusually strong for a cover-based construction. If 0 and 1 are finite metric spaces with
2
then the persistence modules of the corresponding box filtrations are 3-interleaved, with
4
and parameter changes bounded by
5
The bottleneck distance then satisfies
6
The paper also proves an intersection property specific to boxes: if every pair of boxes intersects, then every higher-order intersection is nonempty. This is the reason it can claim that pairwise intersections already determine the full nerve, unlike the usual ball-based distinction between VR and Čech.
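The intersection property can be checked numerically on small examples. The sketch below builds the nerve of a set of axis-aligned closed boxes from pairwise intersections only and provides a direct test for a common intersection. The box encoding as lists of `(low, high)` intervals and the function names are illustrative assumptions.

```python
from itertools import combinations

def boxes_intersect(a, b):
    """Two axis-aligned closed boxes meet iff their intervals overlap in every dimension."""
    return all(max(al, bl) <= min(ah, bh) for (al, ah), (bl, bh) in zip(a, b))

def common_intersection_nonempty(boxes):
    """Direct check that all given boxes share at least one point."""
    dims = len(boxes[0])
    return all(max(b[d][0] for b in boxes) <= min(b[d][1] for b in boxes)
               for d in range(dims))

def nerve(boxes, max_dim=2):
    """Nerve up to max_dim, built from pairwise intersections alone."""
    n = len(boxes)
    edges = {frozenset(e) for e in combinations(range(n), 2)
             if boxes_intersect(boxes[e[0]], boxes[e[1]])}
    simplices = [frozenset([i]) for i in range(n)] + sorted(edges, key=sorted)
    for k in range(3, max_dim + 2):
        for s in combinations(range(n), k):
            if all(frozenset(e) in edges for e in combinations(s, 2)):
                simplices.append(frozenset(s))
    return simplices

example = [[(0, 2), (0, 2)], [(1, 3), (0, 2)], [(1, 2), (1, 3)], [(10, 11), (10, 11)]]
complex_ = nerve(example)
```

For balls, three pairwise-overlapping sets may have an empty triple intersection, so this shortcut would produce the Vietoris–Rips complex rather than the Čech complex; for boxes the two coincide.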
The reported runtime for the main algorithm is
7
where 8 is the number of expansion steps, 9 the number of initial boxes, 0 the growth increment, 1 the ambient dimension, 2, and 3 the time for one LP. A truncated 4-optimal expansion variant runs in
5
The paper states that the method can summarize noisy circles, noisy ellipses, circles with central clusters, and concentric circles with noise more accurately than VR and distance-to-measure in several examples, especially when the geometry is anisotropic. It also proposes a fast “box mapper” algorithm consisting of 6-means clustering, minimal enclosing boxes, one pixel-cover growth step, and output of the nerve.
5. Filter transform as representation conversion in steerable CNNs
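“FILTRA: Rethinking Steerable CNN by Filter Transform” addresses a different but related use of geometry conversion. Here the object being converted is a learned spatial filter, and the target is a steerable kernel whose channels transform according to a group representation. The paper starts from the standard action of a transformation group $G$ on a feature map $f$, in which a representation $\rho$ acts on the channels while the spatial argument is transformed:
$$[\pi(g) f](x) = \rho(g)\, f(g^{-1} x), \qquad g \in G,$$
and defines a steerable convolution operator with kernel $\kappa$ by the equivariance condition
$$\kappa \star [\pi_{\mathrm{in}}(g) f] = \pi_{\mathrm{out}}(g)\,[\kappa \star f].$$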
Its central claim is that the classic “rotate or flip a filter and stack the copies” recipe can be interpreted directly within the group representation theory of steerable CNNs (Li et al., 2021).
For the cyclic group $C_N$, filter transform is written as
1
with 2. The resulting kernel is no longer a scalar filter but a vector-valued kernel whose output channels correspond to group elements or group states. The paper interprets this as a conversion from the trivial representation at the input to the regular representation at the output. For the dihedral group $D_N$, reflected copies are added as well:
4
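The trivial-to-regular behavior of filter transform can be verified exactly for the four-fold rotation group, where rotations are lossless on a pixel grid. The sketch below is a pure-Python illustration under that assumption; it is not the paper's PyTorch implementation, and the function names are hypothetical. A dihedral variant would additionally stack the flipped copies of the kernel.

```python
def rot90(m):
    """Rotate a square matrix (list of rows) counterclockwise by 90 degrees."""
    return [list(row) for row in zip(*m)][::-1]

def correlate_valid(image, kernel):
    """Plain 'valid' cross-correlation of a square image with a square kernel."""
    n, k = len(image), len(kernel)
    m = n - k + 1
    return [[sum(image[y + a][x + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for x in range(m)] for y in range(m)]

def filter_transform_c4(kernel):
    """Stack the kernel with its three successive 90-degree rotations."""
    stack = [kernel]
    for _ in range(3):
        stack.append(rot90(stack[-1]))
    return stack

def steerable_conv_c4(image, kernel):
    """One output channel per group element (regular representation)."""
    return [correlate_valid(image, k) for k in filter_transform_c4(kernel)]

image = [[1, 2, 0, 3], [4, 0, 1, 2], [2, 5, 3, 1], [0, 1, 4, 2]]
kernel = [[1, 0, 2], [0, 3, 0], [1, 0, 0]]
out = steerable_conv_c4(image, kernel)
out_rot = steerable_conv_c4(rot90(image), kernel)
# Rotating the input rotates every channel and cyclically shifts the channel index.
is_equivariant = all(out_rot[i] == rot90(out[(i - 1) % 4]) for i in range(4))
```

The cyclic shift of channels under an input rotation is exactly the regular-representation behavior described above.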
A major theoretical step is the decomposition of the regular representation into irreducible representations. For $C_N$, the decomposition is written
6
and for $D_N$,
8
The paper describes 9 as essentially a discrete cosine transform / Fourier-like basis and 00 as constructed from 01. This allows filter transform to be reinterpreted as a basis-specific realization of the same steerability constraints that appear in the harmonic-kernel theory of Weiler et al. The paper explicitly constructs kernels for trivial → regular, irrep → regular, and regular → regular mappings, and states that reverse directions follow by transposition when the relevant representations are orthogonal.
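The Fourier-like character of the change of basis can be seen in the simplest case. For a cyclic group of order N, the regular representation of the generator is a cyclic shift, and each discrete Fourier vector is an eigenvector of that shift, so the Fourier basis splits the regular representation into one-dimensional complex irreps. The sketch below checks this numerically. It uses complex exponentials for brevity, whereas the paper works with real irreps, which pair frequencies k and N − k; the function names are illustrative.

```python
import cmath

def shift(x):
    """Regular representation of the generator of the cyclic group: shift by one."""
    n = len(x)
    return [x[(i - 1) % n] for i in range(n)]

def fourier_vector(k, n):
    """k-th discrete Fourier vector of length n."""
    return [cmath.exp(2j * cmath.pi * k * i / n) for i in range(n)]

def eigenvalue(k, n):
    """Scalar by which the shift acts on the k-th Fourier vector."""
    return cmath.exp(-2j * cmath.pi * k / n)

def max_residual(n):
    """Largest deviation from shift(v_k) == eigenvalue(k) * v_k over all k."""
    worst = 0.0
    for k in range(n):
        v = fourier_vector(k, n)
        lam = eigenvalue(k, n)
        worst = max(worst, max(abs(s - lam * c) for s, c in zip(shift(v), v)))
    return worst

residual = max_residual(8)  # numerically zero up to rounding
```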
For 05 and irrep frequency 06, the irrep-to-regular kernel is
07
while for regular-to-regular mappings the paper gives basis-change constructions using 08 or 09. Its point is not merely that such kernels exist, but that they retain the intuitive implementation pattern of filter transform while inheriting the formal guarantees of steerable CNN theory. The paper therefore presents FILTRA as equivalent in spirit to the harmonic-basis construction of Weiler et al. for discrete groups $C_N$ and $D_N$, but simpler to code and understand.
The experimental section evaluates MNIST, KMNIST, FashionMNIST, EMNIST, and CIFAR-10 under 12 and 13. FILTRA is compared against R2Conv from E2CNN and vanilla convolution. The paper reports that FILTRA is generally comparable to R2Conv, sometimes slightly better on OCR-like datasets with simple textures, and slightly worse on CIFAR-10, which it attributes to interpolation artifacts in discrete rotated filters and high-frequency content. It also reports strong results on a regression task predicting character orientation as a 2D direction vector transforming like an irrep 14. The implementation discussion states that a minimal self-contained PyTorch version takes about 60 lines, that FILTRA and R2Conv have similar training-time generation cost for 15, that FILTRA is slightly faster for 16, and that both have the same inference-time cost as vanilla convolution.
6. Shared principles, distinctions, and limitations
Across these papers, filter-based geometry conversion repeatedly appears as constrained conversion rather than unconstrained modification. In radiance-field stylization, the deformation field is introduced precisely because direct density optimization breaks the coupling between geometry and appearance. In table-driven mesh transformation, locality and uniqueness conditions are imposed so that the transformed mesh can preserve boundaries, adjacency, orientations, and efficient numbering. In box filtration, adaptive box growth is constrained by an LP that balances inclusion against compactness, and the output is restricted to nerves of convex covers with stability guarantees. In FILTRA, transformed filter copies are constrained by representation structure so that the resulting kernel is equivariant rather than merely augmented (Jung et al., 2024, Knepley, 19 Jun 2025, Alvarado et al., 2024, Li et al., 2021).
A common misconception would be to treat “filter-based” as denoting only low-level signal filtering or only image-space operations. The cited work shows four different meanings: a depth-map guide for stylizing 3D geometry, a selective graph-rewriting rule over mesh entities, a filtration that converts geometry into persistent topological summaries, and a filter transform that converts a spatial kernel into a group-indexed steerable operator. This suggests that the unifying theme is procedural restriction and structural preservation, not a single mathematical object.
The limitations described in the papers are equally domain-specific. The radiance-field paper notes that per-pixel matching alone cannot transfer geometry well because shape is relational rather than local. The mesh-transformation paper states that current tables are manual and specific, and suggests that they should ideally come from a more compact mathematical encoding; it mentions possible future encodings using CW complexes or chain complexes, while also noting that orientation propagation likely needs richer structure than a chain complex with coefficients in 17. The box-filtration paper trades isotropic simplicity for LP-based optimization and associated parameter choices such as 18, 19, and the number of expansion steps. FILTRA reports interpolation artifacts for discrete rotated filters on high-frequency data such as CIFAR-10.
Taken together, these works indicate that filter-based geometry conversion is not a single algorithmic family but a recurring methodological pattern. Geometry is converted by a structured mediator—depth guide, production table, cover filtration, or group action—chosen so that some downstream invariant or consistency condition remains computable.