Z-Monotonic Signed Distance Field (SDF)
- Z-Monotonic SDF is a signed distance field computed along the positive Z-axis, ensuring directional monotonicity through strict analytical constraints.
- Its methodology uses an MLP with implicit differentiation and monotonic squashing functions to offer precise surface prediction and gradient accuracy.
- Applications span differentiable rendering, neural shape representation learning, and mesh extraction techniques like Marching Cubes for robust 3D modeling.
Differentiable rendering techniques refer to approaches that enable the computation of gradients of rendered images or fields with respect to scene parameters. These techniques underpin modern neural shape representation learning, inverse graphics, and various neural scene reconstruction pipelines. Recent formulations have emphasized implicit differentiation and neural network parametrizations to facilitate high-fidelity reconstruction, optimization, and downstream tasks.
1. Mathematical Foundation: Signed Directional Distance Functions
A central formulation in differentiable rendering employs the Signed Directional Distance Function (SDDF), which, for a closed object $O \subset \mathbb{R}^3$, gives for any point $p \in \mathbb{R}^3$ and unit direction $\eta \in S^2$ the signed distance along $\eta$ from $p$ to the boundary $\partial O$:

$$
f(p, \eta) = d \;\;\text{such that}\;\; p + d\,\eta \in \partial O,
$$

with $d > 0$ when the boundary lies ahead of $p$ along $\eta$, $d < 0$ once it has been passed, and $f(p, \eta) = \infty$ if the ray never intersects $\partial O$.
For the special case $\eta = e_3 = [0, 0, 1]^\top$, the SDDF reduces to a Z-monotonic Signed Distance Field (SDF) $f_Z(p) = f(p, e_3)$, which measures signed distance along the positive Z axis. The SDDF generalizes classic SDFs by orienting measurement along an arbitrary direction, enabling the decoupling of spatial and directional dependencies in implicit neural shape representation (Zobeidi et al., 2021).
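To make the definition concrete, here is a minimal analytic example (an illustration, not code from the paper): the SDDF of a sphere obtained by ray-sphere intersection. The helper name `sphere_sddf` and the sign convention for points that have already passed the boundary are assumptions chosen so that the value decreases linearly along $\eta$.

```python
import numpy as np

def sphere_sddf(p, eta, radius=1.0):
    """Illustrative SDDF of a sphere centered at the origin: signed distance along
    unit direction eta from p to the boundary (negative once the boundary lies
    behind p along eta), +inf if the line through p never meets the sphere.
    The convention for points past the object is an assumption made here so that
    the value decreases linearly along eta."""
    p, eta = np.asarray(p, float), np.asarray(eta, float)
    b = np.dot(p, eta)                   # roots of |p + t*eta|^2 = r^2 solve t^2 + 2bt + c = 0
    c = np.dot(p, p) - radius**2
    disc = b**2 - c
    return -b - np.sqrt(disc) if disc >= 0 else np.inf

e3 = np.array([0.0, 0.0, 1.0])
print(sphere_sddf([0.0, 0.0, -2.0], e3))   # 1.0  -> boundary one unit ahead along +Z
print(sphere_sddf([0.0, 0.0,  0.0], e3))   # -1.0 -> boundary passed one unit ago
print(sphere_sddf([2.0, 0.0,  0.0], e3))   # inf  -> the vertical line misses the sphere
```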
2. Monotonicity Constraint and Theoretical Guarantees
Any valid SDDF must satisfy the directional Eikonal constraint:

$$
f(p + t\,\eta, \eta) = f(p, \eta) - t
$$

for all valid $p$, $\eta$, and $t$. In differential notation:

$$
\nabla_p f(p, \eta)^\top \eta = -1.
$$

For $\eta = e_3$, this simplifies to $\partial f / \partial z = -1$. This monotonicity directly constrains the function class of admissible SDDFs, reducing solution space dimensionality and offering analytical guarantees: surface-point prediction error does not increase with sample-to-surface distance, obviating the need for dense near-surface sampling. The construction enforces that values decrease linearly along the chosen direction (Zobeidi et al., 2021).
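For $\eta = e_3$ the constraint can be checked numerically. The sketch below uses a closed-form Z-monotonic field for a unit sphere (`fz_sphere` is a hypothetical helper, written under the same convention as the example above) and verifies both the finite-shift identity and the differential form:

```python
import numpy as np

def fz_sphere(x, y, z, r=1.0):
    # Closed-form Z-monotonic SDF of a unit sphere (same convention as above):
    # distance along +Z to the first boundary crossing, +inf if the vertical line misses.
    disc = r**2 - x**2 - y**2
    return -z - np.sqrt(disc) if disc >= 0 else np.inf

# Directional Eikonal constraint: stepping t along +Z decreases the value by exactly t.
x, y, z, t = 0.3, 0.1, -2.0, 0.75
assert np.isclose(fz_sphere(x, y, z + t), fz_sphere(x, y, z) - t)

# Equivalently, the finite-difference estimate of df/dz is -1.
h = 1e-5
print((fz_sphere(x, y, z + h) - fz_sphere(x, y, z)) / h)   # ≈ -1.0
```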
3. Neural Network Parameterization and Input Encoding
The SDDF is realized by parameterizing $f(p, \eta)$ via an MLP designed to enforce the monotonicity constraint by construction. For a unit direction $\eta$, let $R_\eta$ be a rotation aligning $\eta$ with $e_3$. After applying $R_\eta$ and dropping the last coordinate, one encodes the input as:
- the first two coordinates of $R_\eta p$ (the component of $p$ orthogonal to $\eta$),
- concatenated with the direction encoding of $\eta$ and, for category-level models, a latent code,
- passed through a deep MLP (e.g., depth 16, width 512, periodic/softplus activations, skip connections).
The network outputs a scalar $q_\Theta(p, \eta)$, and the signed distance is computed via

$$
f(p, \eta) = \varphi^{-1}\!\big(q_\Theta(p, \eta)\big) - p^\top \eta,
$$

where $\varphi$ is a strictly monotonic squashing function (e.g., sigmoid, tanh, erf) with a well-defined cap $\varphi(\infty)$ (Zobeidi et al., 2021). For $\eta = e_3$, no rotation is necessary, and only the $(x, y)$ coordinates and the network output are needed.
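The following PyTorch sketch illustrates this construction for the Z-monotonic case; the class name `ZMonotonicSDF`, the layer sizes, and the scaled-sigmoid choice of $\varphi$ are assumptions for illustration rather than the paper's exact architecture. Because the MLP never sees the z-coordinate, the monotonicity constraint holds exactly regardless of the weights:

```python
import torch
import torch.nn as nn

class ZMonotonicSDF(nn.Module):
    """Sketch of a Z-monotonic SDF network (illustrative; not the paper's exact
    architecture). The MLP sees only (x, y) and an optional latent code, so
    f_Z(p) = phi_inv(q) - z decreases linearly in z by construction."""

    def __init__(self, latent_dim=0, width=256, depth=8, cap=10.0):
        super().__init__()
        layers, in_dim = [], 2 + latent_dim
        for _ in range(depth):
            layers += [nn.Linear(in_dim, width), nn.Softplus()]
            in_dim = width
        layers.append(nn.Linear(in_dim, 1))
        self.mlp = nn.Sequential(*layers)
        self.cap = cap                       # finite stand-in for phi(inf)

    def phi(self, s):
        # Strictly monotonic squashing onto (0, cap); preserves monotonicity.
        return self.cap * torch.sigmoid(s)

    def phi_inv(self, q):
        q = q.clamp(1e-6, self.cap - 1e-6)   # numerical analogue of min(q, phi(inf))
        return torch.log(q / (self.cap - q))

    def forward(self, p, z_lat=None):
        # p: (N, 3) query points; z_lat: optional (N, latent_dim) latent codes.
        xy = p[:, :2]                        # dropping the z-coordinate enforces df/dz = -1
        inp = xy if z_lat is None else torch.cat([xy, z_lat], dim=-1)
        return self.phi(self.mlp(inp).squeeze(-1))   # q in (0, cap)

    def signed_distance(self, p, z_lat=None):
        # f_Z(p) = phi^{-1}(q) - z
        return self.phi_inv(self.forward(p, z_lat)) - p[:, 2]

# The architectural guarantee: translating queries along +Z changes the output by exactly -t.
net = ZMonotonicSDF()
p = torch.randn(4, 3)
t = 0.7
print(torch.allclose(net.signed_distance(p + torch.tensor([0.0, 0.0, t])),
                     net.signed_distance(p) - t, atol=1e-5))   # True
```

The final check confirms the guarantee discussed in Section 6: shifting a query along +Z changes the prediction by exactly the negative of the shift, independent of the learned weights.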
4. Training Objective and Data Requirements
The loss aggregates contributions from finite and infinite depth rays. Define measurement sets $F$ (rays with finite measured depth $d_i$ from position $p_i$) and $I$ (rays that never hit the surface), and weighting hyperparameters $\alpha$, $\beta$:

$$
\ell(\Theta) = \alpha\,\frac{1}{|F|}\sum_{i \in F}\Big|\varphi\big(d_i + p_i^\top e_3\big) - q_\Theta(p_i, e_3)\Big|^p
\;+\; \beta\,\frac{1}{|I|}\sum_{j \in I} r\Big(\varphi(\infty) - q_\Theta(p_j, e_3)\Big)^p
\;+\; \gamma\,\|\Theta\|^p,
$$

where $r$ is ReLU or softplus, $\Theta$ denotes the network weights, and $\gamma$ weights the regularizer. For category-level learning, a latent shape code with an additional regularizer $\sigma\|\cdot\|^p$ is jointly optimized. When only depth measurements are available (e.g., Lidar, depth sensors), the model can learn without direct 3D supervision. Analytical properties are preserved under the squashing function (Zobeidi et al., 2021).
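A compact sketch of this objective (hypothetical tensor names; it assumes the targets $\varphi(d_i + p_i^\top e_3)$ have been precomputed and uses the softplus choice for $r$):

```python
import torch
import torch.nn.functional as F

def sddf_loss(q_fin, targets_fin, q_inf, cap, params=None,
              alpha=1.0, beta=1.0, gamma=1e-4, p=1):
    """Sketch of the training objective; tensor names are hypothetical.
    q_fin:       network outputs q for rays with finite measured depth
    targets_fin: precomputed targets phi(d_i + p_i^T e3) for those rays
    q_inf:       network outputs for rays that never hit the surface
    cap:         phi(inf), the finite value assigned to non-hitting rays"""
    loss_f = (q_fin - targets_fin).abs().pow(p).mean()    # finite-depth data term
    loss_inf = F.softplus(cap - q_inf).pow(p).mean()      # r = softplus variant of the infinite term
    loss_reg = 0.0 if params is None else gamma * sum(w.abs().pow(p).sum() for w in params)
    return alpha * loss_f + beta * loss_inf + loss_reg

# e.g., loss = sddf_loss(q_fin, targets_fin, q_inf, cap=10.0, params=model.parameters())
```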
5. Training Workflow and Inference Procedure
The Z-monotonic SDDF instantiation follows a standard supervised learning pipeline:
Training Loop
```
while not converged:
    sample minibatches F_b ⊂ F, I_b ⊂ I
    compute q_i = q_Θ(p_i, e3, z_lat) for i ∈ F_b ∪ I_b    # z_lat: latent code (category-level only)
    ℓ_f   = mean_{i ∈ F_b} |φ(d_i + p_i.z) - q_i|^p        # finite-depth data term
    ℓ_∞   = mean_{j ∈ I_b} r(φ(∞) - q_j)^p                 # term for rays that never hit
    ℓ_reg = γ‖Θ‖^p  (+ σ‖z_lat‖^p for category-level)
    ℓ     = α ℓ_f + β ℓ_∞ + ℓ_reg
    Θ ← Θ - η ∇_Θ ℓ                                        # η: learning rate
    if category-level: z_lat ← z_lat - η_z ∇_{z_lat} ℓ
```
Inference (querying the Z-monotonic SDF at a point x = (x, y, z)):

```
q = q_Θ([x, y, z], e3, z_lat)         # z_lat: latent code (omit for single-shape models)
h = φ⁻¹(min(q, φ(∞))) - z             # yields f_Z(x), the signed distance along +Z
```
6. Analytical Properties and Confidence Guarantees
The proposed formulation ensures that the monotonicity constraint is obeyed exactly due to architectural structure. Lemma 1 confirms that defining the SDDF as a function of the projection of $p$ onto the plane orthogonal to $\eta$, minus $p^\top \eta$, suffices to enforce the constraint. Lemma 4 further demonstrates that any strictly monotonic squashing function used to stabilize the output preserves this property. Proposition 1 guarantees linear decrease along the specified direction for all SDDF outputs, and that prediction error does not increase with distance from the surface. Thus, sampling need not be denser near the boundary, simplifying data collection (Zobeidi et al., 2021).
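The core argument can be restated in one line (this is a compact verification in the notation introduced above, not a quotation of the paper's proofs). Writing $g(P_\eta p, \eta) = \varphi^{-1}\!\big(q_\Theta(p, \eta)\big)$ for the un-squashed network output, where $P_\eta$ projects $p$ onto the plane orthogonal to $\eta$ (the first two coordinates of $R_\eta p$ in the encoding of Section 3), the construction gives

$$
f(p + t\eta, \eta) = g\big(P_\eta(p + t\eta), \eta\big) - (p + t\eta)^\top \eta
= g\big(P_\eta p, \eta\big) - p^\top \eta - t = f(p, \eta) - t,
$$

since $P_\eta \eta = 0$ and $\eta^\top \eta = 1$; differentiating in $t$ recovers $\nabla_p f(p, \eta)^\top \eta = -1$ exactly, independent of the network weights.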
7. Model Scope, Assumptions, and Limitations
The Z-monotonic SDDF is intrinsically limited to representing distances along the positive Z axis. Consequently, objects featuring overhangs or undercuts relative to Z may yield infinite distances for some queries (rays never intersect the surface). The fixed-direction design obviates the need for rotation or direction encoding, simplifying computation but restricting the class of representable view geometries. Both finite and infinite rays along Z must be present in training data at all relevant positions. For category-level models, test-time code optimization requires good initialization, such as a mean latent vector. RGB supervision and color-based cues are not utilized; all supervision is from distance measurements (Lidar, depth only). Limitations also include bias introduced near infinite depth by the squashing/capping function, and the mesh extraction process is subject to standard grid and marching cubes discretization artifacts (Zobeidi et al., 2021).
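As an illustration of mesh extraction and of the fixed-direction limitation, the sketch below samples an analytic Z-monotonic field on a regular grid and runs scikit-image's marching cubes on its zero level set. The grid resolution, the finite stand-in value for non-hitting rays, and the masking remark are assumptions for illustration, not the paper's procedure:

```python
import numpy as np
from skimage import measure

def fz_sphere(x, y, z, r=1.0):
    """Analytic Z-monotonic SDF of a sphere (first boundary crossing along +Z,
    as seen from below); +inf where the vertical line misses the surface."""
    disc = r**2 - x**2 - y**2
    return -z - np.sqrt(disc) if disc >= 0 else np.inf

def extract_mesh(f_z, lo=-1.5, hi=1.5, res=64, miss_value=1.0):
    """Sample f_Z on a regular grid and run marching cubes on the zero level set.
    miss_value is a finite stand-in for rays that never hit; cells bordering such
    rays can produce spurious faces, so masking or cropping them is a typical
    workaround (assumed here, not a procedure from the paper)."""
    lin = np.linspace(lo, hi, res)
    xs, ys, zs = np.meshgrid(lin, lin, lin, indexing="ij")
    vals = np.vectorize(f_z)(xs, ys, zs)
    vals = np.where(np.isfinite(vals), vals, miss_value)
    step = (hi - lo) / (res - 1)
    verts, faces, _, _ = measure.marching_cubes(vals, level=0.0, spacing=(step, step, step))
    return verts + lo, faces      # shift from grid to world coordinates

verts, faces = extract_mesh(fz_sphere)
print(verts.shape, faces.shape)   # recovers the surface visible along +Z (lower hemisphere),
                                  # plus spurious faces near rays that never hit (see docstring)
```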