Soft-Position Embedding in Coordinate-MLPs
- Soft-position embedding is a learnable, smooth positional encoding that maps input coordinates using super-Gaussian functions with instance-specific scale parameters.
- It employs a graph-Laplacian smoothness prior to balance fidelity to local details with overall generalization, outperforming traditional random Fourier features.
- A two-stage optimization decouples the embedding hyperparameter fitting from MLP training, ensuring improved training stability and high-fidelity signal reconstruction.
Soft-position embedding is a learnable, smooth positional encoding scheme tailored for coordinate-MLPs, in which each input coordinate is mapped to a high-dimensional space via a parameterized, instance-specific transformation. This framework enables each coordinate to have its own local embedding bandwidth, governed by per-coordinate scale parameters. These scales are optimized using a graph-Laplacian smoothness prior, carefully balancing fidelity to complex local detail with generalization, and providing high stability in both training and inference. The methodology outperforms classical position encodings such as random Fourier features (RFF), particularly in high-fidelity signal and image regression and neural rendering tasks (Ramasinghe et al., 2021).
1. Super-Gaussian Instance-specific Embedding Construction
Soft-position embedding projects each coordinate $\mathbf{x} \in \mathbb{R}^n$ into a $d$-dimensional vector via super-Gaussian radial basis functions, with a learnable, instance-specific scale parameter $\sigma(\mathbf{x})$. The embedding is defined as $\Psi(\mathbf{x}) = [\psi_1(\mathbf{x}), \ldots, \psi_d(\mathbf{x})]^\top$, with each component

$$\psi_j(\mathbf{x}) = \exp\!\left(-\left|\frac{\mathbf{w}_j^\top \mathbf{x} - b_j}{\sigma(\mathbf{x})}\right|^{p}\right),$$

where $\mathbf{w}_j$ is a projection vector, the $b_j$ are fixed offsets, $p$ is a fixed exponent, and $\sigma(\mathbf{x})$ is the coordinate-dependent (learnable) scale. This construction enables localized control over the embedding bandwidth, adapting the smoothness and expressivity to each coordinate instance.
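As a concrete sketch, the component formula can be written in a few lines of numpy; the dimensions, the exponent value `p=4`, and a single scalar `sigma` per query coordinate are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def soft_position_embed(x, W, b, sigma, p=4):
    """Super-Gaussian embedding of a single coordinate x in R^n.

    W     : (d, n) array stacking the projection vectors w_j
    b     : (d,)   fixed offsets b_j
    sigma : instance-specific scale for this coordinate (learnable)
    p     : fixed super-Gaussian exponent (p > 2 flattens each bump's top)
    """
    z = W @ x - b                           # projected, shifted coordinate, shape (d,)
    return np.exp(-np.abs(z / sigma) ** p)  # each component lies in (0, 1]

rng = np.random.default_rng(0)
n, d = 2, 16
W = rng.normal(size=(d, n))
b = np.linspace(-1.0, 1.0, d)
emb = soft_position_embed(np.array([0.3, -0.5]), W, b, sigma=0.5)  # shape (16,)
```

Smaller `sigma` narrows each bump, raising the embedding's effective bandwidth at that coordinate; larger `sigma` smooths it.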
2. Joint Objective: Data Fit and Graph-Laplacian Smoothness
The embedding is trained alongside an MLP $f_\theta$ using a combined surrogate loss composed of a conventional data-fitting term and a graph-Laplacian regularizer. For input–output pairs $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$:
- Define $g_i = \|J_{f}(\mathbf{x}_i)\|_F$ as the Frobenius norm of the input Jacobian of $f_\theta$ at $\mathbf{x}_i$, and collect these into $\mathbf{g} = (g_1, \ldots, g_N)$.
- The loss function is $\mathcal{L} = \mathcal{L}_{\text{data}} + \lambda\,\mathcal{L}_{\text{lap}}$, where
$\mathcal{L}_{\text{data}} = \frac{1}{N} \sum_{i=1}^{N} \| f_\theta(\Psi(\mathbf{x}_i)) - y_i \|_2^2$
is the empirical data loss, and
$\mathcal{L}_{\text{lap}} = \mathbf{g}^\top L\,\mathbf{g} + \mu\,\mathcal{R}(\sigma)$
is the Laplacian regularizer with an “anti-collapse” term $\mathcal{R}(\sigma)$ that prevents the learned scales from shrinking to a degenerate solution. Here, $L = D - A$ is the graph Laplacian over embedded coordinates and $A$ the adjacency matrix. Hyperparameters $\lambda$ and $\mu$ govern the trade-off and collapse prevention.
This formulation encourages local smoothness of target Jacobians across the learned manifold, effectively matching manifold volume elements to the model’s functional complexity at each location.
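Since the regularizer operates on Jacobian Frobenius norms, a small helper to estimate them is useful. The finite-difference version below is a sketch (in practice automatic differentiation would be used), and the function `f` is a stand-in for the trained network.

```python
import numpy as np

def jacobian_fro_norm(f, x, eps=1e-4):
    """Frobenius norm of the input Jacobian of f at x, via central differences."""
    x = np.asarray(x, dtype=float)
    y0 = np.atleast_1d(f(x))
    J = np.zeros((y0.size, x.size))
    for k in range(x.size):
        e = np.zeros_like(x)
        e[k] = eps
        J[:, k] = (np.atleast_1d(f(x + e)) - np.atleast_1d(f(x - e))) / (2 * eps)
    return np.linalg.norm(J, "fro")

# For the linear map f(x) = (2*x0, 3*x1) the Jacobian is diag(2, 3),
# so its Frobenius norm is sqrt(4 + 9) = sqrt(13).
g_val = jacobian_fro_norm(lambda x: np.array([2 * x[0], 3 * x[1]]), np.array([0.1, 0.2]))
```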
3. Graph Laplacian Construction and Smoothness Enforcement
The graph structure is imposed over the set of embedded coordinates $\{\Psi(\mathbf{x}_i)\}$ by defining an adjacency matrix $A$:

$$A_{ij} = \exp\!\left(-\frac{\|\Psi(\mathbf{x}_i) - \Psi(\mathbf{x}_j)\|_2^2}{2\tau^2}\right),$$

where the bandwidth $\tau$ controls smoothing. The (unnormalized) Laplacian is $L = D - A$, for $D$ the diagonal degree matrix with $D_{ii} = \sum_j A_{ij}$.
Minimization of $\mathbf{g}^\top L\,\mathbf{g}$ enforces that embedded points close on the positional manifold exhibit similar local functional complexity, as measured by Jacobian norms. The continuous analogue aligns the embedding’s metric determinant with the network’s Jacobian norm, regulating the positional manifold’s distortion.
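A minimal numpy sketch of this construction, using a Gaussian kernel with a single bandwidth `tau` (an assumption; the exact kernel parameters may differ):

```python
import numpy as np

def graph_laplacian(E, tau=1.0):
    """Unnormalized Laplacian L = D - A over embedded points E of shape (N, d),
    with adjacency A_ij = exp(-||E_i - E_j||^2 / (2 tau^2)) and zero diagonal."""
    sq = ((E[:, None, :] - E[None, :, :]) ** 2).sum(axis=-1)  # pairwise sq. distances
    A = np.exp(-sq / (2.0 * tau ** 2))
    np.fill_diagonal(A, 0.0)
    D = np.diag(A.sum(axis=1))
    return D - A

def smoothness(g, L):
    """Quadratic form g^T L g = 0.5 * sum_ij A_ij (g_i - g_j)^2."""
    return g @ L @ g

rng = np.random.default_rng(0)
E = rng.normal(size=(5, 3))        # five embedded coordinates
L = graph_laplacian(E, tau=0.8)
flat = smoothness(np.ones(5), L)   # a constant g is maximally smooth
```

A constant `g` incurs zero penalty, while differences between strongly connected neighbors are penalized quadratically.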
4. Two-Stage Optimization: Decoupling Embedding and Network Fitting
The optimization proceeds in two decoupled stages:
- Stage I: Embedding hyperparameters ($\sigma$) are fit by minimizing $\mathcal{L}_{\text{lap}}$ using gradient descent, with gradients propagating through $\Psi$ into the Laplacian and adjacency matrices. To circumvent per-instance re-optimization at scale, the mapping from input gradient norms to $\sigma$ is approximated by a low-degree polynomial:
$\sigma(\mathbf{x}) \approx \sum_{k=0}^{K} a_k\, g(\mathbf{x})^{k},$
where the coefficients $a_k$ are learned via least squares on a small calibration set and can be reused for new functions without re-search.
- Stage II: With $\sigma$ fixed, $\Psi$ becomes a deterministic embedding. The MLP $f_\theta$ is then trained by standard stochastic optimization (e.g., Adam) to minimize data error, using fixed embedding parameters.
This staged decoupling prevents overfitting observed in joint end-to-end optimization. The procedure yields robust generalization by separating embedding smoothness induction from function approximation.
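The Stage I polynomial can be fit with an ordinary least-squares routine. The calibration values below and the degree-3 choice are hypothetical placeholders for illustration.

```python
import numpy as np

# Hypothetical calibration set: local gradient magnitudes g_i paired with the
# scales sigma_i found by Stage I optimization (larger gradients -> smaller scales).
g_cal = np.array([0.1, 0.5, 1.0, 2.0, 3.0])
sigma_cal = np.array([0.9, 0.6, 0.4, 0.25, 0.2])

coeffs = np.polyfit(g_cal, sigma_cal, deg=3)  # least-squares polynomial fit
sigma_new = np.polyval(coeffs, 1.5)           # reuse on a new function's gradient norm
```

Once fitted, the coefficients map any gradient-norm estimate directly to a scale, so new signals need no per-instance search.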
5. Comparative Evaluation Against Random Fourier Features
Quantitative and qualitative evaluation demonstrates significant improvements of soft-position embedding over RFF:
| Task | Test PSNR (RFF) | Test PSNR (Soft-Embedding) | SSIM (RFF) | SSIM (Soft-Embedding) |
|---|---|---|---|---|
| 1D Signals | ~26 dB | ~31 dB | – | – |
| 2D Images | – | +2–3 dB relative gain | – | – |
| 3D NeRF Scenes | – | +1 dB relative gain | .947 | .981 |
- RFF requires per-instance frequency tuning; the soft-position approach attains superior results with universal coefficients.
- RFF fails under permutation/undersampling of its frequencies, while adaptive per-instance scaling confers robustness.
- Shallow MLPs exhibit higher signal fidelity with the smooth embeddings, highlighting the ease of function approximation on smooth positional manifolds.
6. Gradient Stability and Intermediate Integration
Empirical results reveal that, when back-propagating through the embedding $\Psi$, the super-Gaussian manifold generates stable, low-variance gradients even under coordinate permutations or deep stacking. In contrast, RFF induces noisy, ill-conditioned gradient fields. Consequently, the embedding can be reliably inserted as an intermediate layer (e.g., at a U-Net bottleneck) without risk of gradient explosion or collapse. This stability is critical for scalable integration of positional encodings in deep architectures.
7. High-Level Implementation and Practical Considerations
Implementation follows a two-stage recipe:
- Polynomial Fitting for $\sigma$
- Select a calibration subset and estimate gradient magnitudes $g_i$.
- Optimize for the scales $\sigma_i$ via gradient descent on the Laplacian loss.
- Perform a least-squares fit for the polynomial coefficients mapping $g \mapsto \sigma$.
- Training the MLP with Fixed Embedding
- Compute $g_i$ and derive $\sigma_i$ for all training points.
- Generate $\Psi(\mathbf{x}_i)$ for each input using the fitted $\sigma$.
- Train $f_\theta$ to minimize the data loss over the embedded inputs via Adam or SGD.
Optional iterative updates of $\sigma$ alongside network training may further refine the embedding, although the two-stage methodology generally suffices for strong performance.
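The recipe can be exercised end to end on a 1D toy signal. In this sketch a grid search stands in for the Stage I gradient descent, a linear least-squares readout stands in for the MLP, and one global scale stands in for per-coordinate scales; all three are simplifying assumptions for illustration only.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 64)
y = np.sin(4 * np.pi * x)                       # toy target signal

centers = np.linspace(-1.0, 1.0, 32)            # fixed offsets b_j
def embed(x, sigma, p=4):
    """Super-Gaussian features, one bump per center."""
    return np.exp(-np.abs((x[:, None] - centers[None, :]) / sigma) ** p)

# Stage I: choose sigma by minimizing the Laplacian surrogate g^T L g.
g = np.abs(np.gradient(y, x))                   # local gradient magnitudes
def surrogate(sigma, tau=0.5):
    E = embed(x, sigma)
    sq = ((E[:, None, :] - E[None, :, :]) ** 2).sum(axis=-1)
    A = np.exp(-sq / (2.0 * tau ** 2))
    np.fill_diagonal(A, 0.0)
    L = np.diag(A.sum(axis=1)) - A
    return g @ L @ g

grid = np.linspace(0.05, 1.0, 20)
sigma_star = grid[np.argmin([surrogate(s) for s in grid])]

# Stage II: with sigma fixed, fit the readout on the now-deterministic embedding.
E = embed(x, sigma_star)
w, *_ = np.linalg.lstsq(E, y, rcond=None)
mse = np.mean((E @ w - y) ** 2)
```

The decoupling is visible in the code: the surrogate never touches the readout weights, and the readout fit never changes `sigma_star`.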
Summary
Soft-position embedding employs a super-Gaussian coordinate-to-feature mapping with per-instance bandwidth, learned under a graph-Laplacian smoothness prior. This architecture optimally balances local detail and global smoothness, exceeds RFF in fidelity and robustness, and is suitable for integration within deep neural networks without laborious hyperparameter search or gradient instability (Ramasinghe et al., 2021).