
Feature-Adaptive Implicit Neural Representations

Updated 17 November 2025
  • FA-INR is a neural architecture that integrates data-driven feature adaptivity into implicit neural representations through dynamic activation modulation and memory-based feature retrieval.
  • It employs techniques such as Incode-style adaptive sine activations and cross-attention with Mixture-of-Experts routing to leverage local data complexity for optimal reconstruction fidelity.
  • Empirical results demonstrate state-of-the-art performance in PSNR, SSIM, and IoU across audio, imaging, and scientific simulations, albeit with increased computational overhead.

Feature-Adaptive Implicit Neural Representation (FA-INR) refers to a class of neural architectures in which the implicit mapping of coordinates (and potentially auxiliary parameters) to signal values is dynamically conditioned on intermediate or global features, enabling model capacity to adapt flexibly to local data complexity. FA-INR methods depart from conventional implicit neural representations (INRs)—where parameters and activation functions are fixed—by introducing explicit mechanisms for feature-driven modulation of either neural activations or feature retrieval, often resulting in superior reconstruction fidelity and parameter efficiency across modalities such as audio, imaging, and scientific simulation.

1. Definitional Scope and Taxonomy

FA-INR encompasses architectures that, during inference, modulate either their intermediate activations or their internal feature representations based on data-adaptive context. According to the taxonomy of INRs (Essakine et al., 6 Nov 2024), feature adaptivity is realized either by predicting the parameters of activation functions using internal statistics (“Incode” style) or by using additional adaptive mechanisms such as cross-attention over feature memory banks governed by Mixture-of-Experts (MoE) routing (Li et al., 7 Jun 2025). Distinct from basic INRs employing static positional encodings or fixed nonlinearities, FA-INR approaches incorporate feature-driven conditioning at key model stages, either intra-layer (activation modulation) or as part of hierarchical representation retrieval (memory attention + routing).

2. Core Methodologies and Architectures

2.1 Activation Modulation: Incode-style Feature Adaptivity

In the “Incode” approach (Essakine et al., 6 Nov 2024), each layer’s activation function is not static but receives parameters from a learned, feature-processing network (“harmoniser”), whose input is a summary of the previous layer’s activations. For coordinate $x \in \mathbb{R}^d$ and hidden activations $y_{i-1}$, layer $i$ computes:

$$f_{\text{Incode}}: \qquad y_i = \sigma_i(W_i y_{i-1} + \beta_i)$$

$$\sigma_i(u) = a_i \sin(b_i \omega u + c_i) + d_i$$

where $(a_i, b_i, c_i, d_i) = H_i(f_i)$, $f_i$ is a pooled summary (e.g., global average) of $y_{i-1}$, and $\beta_i$ denotes the ordinary layer bias (written as $\beta_i$ to distinguish it from the sine frequency scale $b_i$). $H_i$ is typically a compact MLP that outputs a per-layer, per-sample sine amplitude, frequency scale, phase, and vertical offset. All components are jointly trained to minimise a suitable reconstruction loss (e.g., $L_2$).

2.2 Memory-Augmented Adaptive Representations

In scientific surrogate modeling (Li et al., 7 Jun 2025), FA-INR augments the standard INR mapping $f_\theta(x, p)$ with a learnable key–value memory bank and cross-attention feature retrieval, optionally routed via a coordinate-driven MoE:

  • Spatial Encoder: $x \mapsto z^{(x)} \in \mathbb{R}^{D_z}$ via an MLP.
  • Memory Query: $q = z^{(x)} W_q \in \mathbb{R}^{D_k}$.
  • Memory Bank: $K \in \mathbb{R}^{M \times D_k}$, $V \in \mathbb{R}^{M \times D_v}$; both learned.
  • Parameter Conditioning: Simulation parameters $p$ are embedded, merged (via elementwise product), and used to adapt $V$ through a residual adapter MLP.
  • Cross-Attention:

    $$\alpha = \mathrm{softmax}\!\left( \frac{q K'^{\top}}{\sqrt{D_k}} \right) \in \mathbb{R}^M, \qquad z^{(x,p)} = \alpha V'$$

  • Mixture-of-Experts Routing: A low-resolution spatial grid and a gating MLP produce expert probabilities; for each $x$, the Top-2 experts are selected and their outputs aggregated.

This design allows the system to allocate model capacity adaptively, focusing on spatial regions or parameter subspaces that exhibit higher local complexity.
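To make the retrieval step concrete, the following PyTorch sketch shows one possible single-expert implementation of the spatial encoder, memory bank, parameter-conditioned value adapter, and cross-attention; the module name, layer widths, and exact adapter form are illustrative assumptions rather than the reference implementation.

# Illustrative single-expert sketch of memory-augmented feature retrieval with
# parameter conditioning. Names and widths are assumptions, not the paper's code.
import torch
import torch.nn as nn

class MemoryRetrieval(nn.Module):
    def __init__(self, coord_dim=3, param_dim=4, M=256, D_z=128, D_k=64, D_v=64):
        super().__init__()
        self.encoder = nn.Sequential(                      # spatial encoder: x -> z^(x)
            nn.Linear(coord_dim, D_z), nn.ReLU(), nn.Linear(D_z, D_z))
        self.W_q = nn.Linear(D_z, D_k, bias=False)         # query projection W_q
        self.K = nn.Parameter(torch.randn(M, D_k))         # learnable keys
        self.V = nn.Parameter(torch.randn(M, D_v))         # learnable values
        self.param_embed = nn.Linear(param_dim, D_v)       # embedding of simulation parameters p
        self.value_adapter = nn.Sequential(                # residual adapter on merged values
            nn.Linear(D_v, D_v), nn.ReLU(), nn.Linear(D_v, D_v))

    def forward(self, x, p):
        # x: (N, coord_dim) query coordinates; p: (param_dim,) simulation parameters
        q = self.W_q(self.encoder(x))                      # queries, shape (N, D_k)
        p_emb = self.param_embed(p)                        # parameter embedding, shape (D_v,)
        V_prime = self.V + self.value_adapter(self.V * p_emb)   # parameter-conditioned values V'
        attn = torch.softmax(q @ self.K.t() / self.K.shape[-1] ** 0.5, dim=-1)  # (N, M)
        return attn @ V_prime                              # retrieved features z^(x,p), shape (N, D_v)

A decoder MLP (Section 3.2) then maps the retrieved feature to the predicted signal value; under MoE routing, several such memory banks act as independent experts.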

3. Algorithmic Descriptions and Key Formulas

3.1 Incode: Feature-Adaptive Sine Activation

For each hidden layer $i$:

f_i = pool(y_prev)                              # pooled summary of the previous layer's activations
a_i, b_i, c_i, d_i = H_i(f_i)                   # harmoniser predicts activation parameters
u_i = W_i @ y_prev + beta_i                     # affine pre-activation (beta_i: layer bias)
y_i = a_i * sin(b_i * omega * u_i + c_i) + d_i  # feature-adaptive sine activation

$\omega$ is a global frequency hyperparameter. $H_i$ is implemented as a small MLP. All parameters are learned end-to-end.
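A runnable PyTorch version of this layer might look as follows; the module name, the harmoniser width, and the choice of pooling over the batch of sampled coordinates are assumptions made for illustration, not the reference implementation.

# Illustrative PyTorch sketch of a feature-adaptive (Incode-style) sine layer.
# Harmoniser width and batch-mean pooling are assumptions, not the reference code.
import torch
import torch.nn as nn

class IncodeLayer(nn.Module):
    def __init__(self, in_dim, out_dim, omega=30.0, harmoniser_width=32):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)           # W_i y_{i-1} + beta_i
        self.omega = omega                                  # global frequency hyperparameter
        self.harmoniser = nn.Sequential(                    # H_i: pooled summary -> (a, b, c, d)
            nn.Linear(in_dim, harmoniser_width), nn.ReLU(),
            nn.Linear(harmoniser_width, 4))

    def forward(self, y_prev):
        # y_prev: (N, in_dim) previous-layer activations (or raw coordinates at the first layer)
        f = y_prev.mean(dim=0, keepdim=True)                # pooled summary f_i (global average)
        a, b, c, d = self.harmoniser(f).unbind(dim=-1)      # activation parameters from the harmoniser
        u = self.linear(y_prev)                             # affine pre-activation
        return a * torch.sin(b * self.omega * u + c) + d    # feature-adaptive sine activation

# Example usage: map 2-D coordinates in [-1, 1]^2 to 256 hidden features.
# layer = IncodeLayer(in_dim=2, out_dim=256)
# features = layer(torch.rand(1024, 2) * 2 - 1)             # shape (1024, 256)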

3.2 FA-INR with Cross-Attention and MoE

Given $x$, $p$, and a set of $E$ experts, the feature retrieval steps are:

  1. Encode $x \rightarrow z^{(x)} \rightarrow q$ (as above).
  2. Embed $p$ and adapt $V$ (value adapter) via a 2-layer MLP.
  3. For each of the Top-2 selected experts (from the gating network), perform projected cross-attention to obtain $z_k^{(x,p)}$.
  4. Aggregate feature vectors: $\bar z^{(x,p)} = \sum_{k \in \mathrm{Top2}(x)} g_k(x)\, z_k^{(x,p)}$.
  5. Decode to the output value $\hat y = f_{\theta_D}(\bar z^{(x,p)})$ (3-layer ReLU MLP).
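The routing and decoding steps could be realized roughly as in the following sketch, which reuses the MemoryRetrieval module sketched in Section 2.2 as the per-expert retrieval; replacing the paper's low-resolution gating grid with a plain gating MLP on the coordinates, and all layer widths, are simplifying assumptions.

# Illustrative sketch of Top-2 MoE routing over memory-retrieval experts with a
# 3-layer ReLU decoder. The MLP gate (instead of a low-resolution gating grid)
# and all widths are simplifying assumptions.
import torch
import torch.nn as nn

class FAINRSurrogate(nn.Module):
    def __init__(self, coord_dim=3, param_dim=4, num_experts=10, D_v=64, out_dim=1):
        super().__init__()
        self.D_v = D_v
        self.experts = nn.ModuleList(                       # one memory bank per expert
            [MemoryRetrieval(coord_dim, param_dim, D_v=D_v) for _ in range(num_experts)])
        self.gate = nn.Sequential(                          # gating network (assumed MLP)
            nn.Linear(coord_dim, 64), nn.ReLU(), nn.Linear(64, num_experts))
        self.decoder = nn.Sequential(                       # f_{theta_D}: 3-layer ReLU MLP
            nn.Linear(D_v, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, out_dim))

    def forward(self, x, p):
        logits = self.gate(x)                               # (N, E) expert scores
        top_w, top_idx = logits.topk(2, dim=-1)             # Top-2 experts per coordinate
        top_w = torch.softmax(top_w, dim=-1)                # normalised gate weights g_k(x)
        z = torch.zeros(x.shape[0], self.D_v, device=x.device)
        for k, expert in enumerate(self.experts):
            chosen = (top_idx == k)                         # (N, 2): where expert k was selected
            rows = chosen.any(dim=-1)
            if rows.any():
                w = (top_w * chosen).sum(dim=-1, keepdim=True)   # gate weight for expert k
                z[rows] += w[rows] * expert(x[rows], p)     # weighted aggregation of z_k^(x,p)
        return self.decoder(z)                              # \hat y

# Example usage (hypothetical shapes):
# model = FAINRSurrogate()
# y_hat = model(torch.rand(4096, 3), torch.rand(4))          # (4096, 1) predicted field values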

4. Empirical Performance and Trade-offs

4.1 Incode (Feature-Adaptive Sine Activation Network)

  • 1D Audio Reconstruction: Second-lowest $L_2$ error; sharper reconstructions than non-adaptive baselines at equal iteration counts.
  • 2D CT Reconstruction: Best PSNR and SSIM across the reported projection counts (e.g., $20$ projections: $24.38$ dB, $0.651$; $300$ projections: $34.76$ dB, $0.953$); surpassed by Fr only at very low sampling rates.
  • Image Denoising: Highest PSNR ($29.63$ dB) but longest runtime ($2,370$ s); alternatives trade a slight PSNR loss for a significant speedup.
  • Super-Resolution: Leading for $2\times$ upsampling (PSNR $29.56$ dB, SSIM $0.896$, LPIPS $0.176$); nearly best for $4\times$; outperformed by Fr/Finer at extreme scales.
  • 3D Occupancy IoU: Near-best performance ($0.99564$), with only Finer marginally higher.

Trade-offs: Significant computational overhead due to harmoniser networks; elevated risk of overfitting from per-layer activation freedom; lack of thorough ablations on harmoniser depth or pooling.

4.2 FA-INR with Cross-Attention and MoE (Scientific Surrogates)

  • MPAS-Ocean dataset ($1.10$M params, $10$ experts): PSNR $51.92$ dB, SSIM $0.9934$, MD $0.1536$; outperforms grid-based and MLP-based methods by wide margins in both accuracy and parameter efficiency.
  • Scaling: Increasing the number of experts from $1$ to $10$ raises PSNR from $46.92$ to $51.92$ dB, with diminishing returns beyond $E = 10$. Model size remains $\sim$1–2M parameters, versus 5–40M for grid-based models.
  • Ablations: The memory bank + MoE architecture yields a $>10$ dB gain over rigid grids/planes at equal parameter count; Top-2 routing outperforms dense or concatenative aggregation.
  • Efficiency: FA-INR trains 20–30% slower than grid/plane INRs but with much greater data efficiency, and it is far faster than running the original scientific simulators.

5. Comparative Analysis and Best Practices

FA-INR architectures provide superior adaptivity compared to traditional INR variants reliant on fixed grids, positional encodings, or static activation functions. Key implementation practices for state-of-the-art performance include:

Component | Best Practices | Typical Settings
Memory bank | $M = 256$ for fields with $\lesssim 10^7$ points; $M = 1024$ for $512^3$ volumes | $D_k = D_v = 64$
Encoder / Decoder | Encoder: 1–4 layers, sine activation; Decoder: 3 layers, ReLU | $D_z = 128$
Routing | Top-2 MoE routing, gating grid resolution $16^3$ | –
Optimizer | Adam, initial lr $10^{-4}$ (grid models); $10^{-6}$ (MLP baselines) | –

Rigid structural assumptions (such as feature grids or planes) are supplanted by data-adaptive interpolation and routing, yielding a new Pareto frontier of accuracy versus parameter cost. Explicit parameter-conditioning adapters are crucial: ablating them reduces PSNR by $\sim 0.7$ dB.
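For concreteness, the settings above could be collected into a configuration along the following lines; the key names and the plain-dictionary format are illustrative assumptions, not a published configuration.

# Illustrative hyperparameter configuration reflecting the table above.
# Key names are assumptions; values follow the reported typical settings.
fa_inr_config = {
    "memory_bank": {"M": 256, "D_k": 64, "D_v": 64},               # M=1024 for ~512^3 volumes
    "encoder": {"layers": 2, "activation": "sine", "D_z": 128},    # 1-4 layers reported
    "decoder": {"layers": 3, "activation": "relu"},
    "routing": {"type": "top2_moe", "num_experts": 10, "gating_grid": (16, 16, 16)},
    "optimizer": {"name": "adam", "lr": 1e-4},                     # 1e-6 reported for MLP baselines
}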

6. Limitations and Future Research Directions

Despite state-of-the-art results, FA-INR architectures entail several limitations:

  • Computational cost: Harmoniser networks and cross-attention introduce overhead, increasing training and inference latency; Incode and memory-augmented MoE FA-INRs are the slowest candidates among current techniques.
  • Overfitting: Greater flexibility (adaptive sine parameters or adaptive memory routing) introduces additional degrees of freedom, raising overfitting risk, particularly on small or noisy tasks.
  • Ablation uncertainty: There is no consensus on optimal harmoniser or gating grid architecture; further ablations are required to explore trade-offs among expressivity, overhead, and regularization.

Proposed improvement directions include:

  • Streamlining harmonisers (parameter sharing, lightweight attention mechanisms);
  • Hybrid models integrating positional encodings and adaptive fusion;
  • Regularization of adaptive parameters (e.g., $(a_i, b_i, c_i, d_i)$ or attention outputs), with a minimal sketch after this list;
  • Dynamic bandwidth scheduling for scale-adaptive allocation of model capacity.
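
As a minimal sketch of the regularization idea above, a penalty on the harmoniser outputs could be added to the reconstruction loss roughly as follows; the penalty form, the neutral target values, and the weighting are assumptions for illustration, not a published recipe.

# Illustrative L2 penalty on harmoniser-predicted activation parameters,
# pulling (a, b) toward 1 and (c, d) toward 0 (a neutral sine activation).
import torch

def regularized_loss(y_pred, y_true, abcd_per_layer, weight=1e-3):
    # abcd_per_layer: list of (a, b, c, d) tensors predicted by each layer's harmoniser
    recon = torch.mean((y_pred - y_true) ** 2)              # L2 reconstruction loss
    penalty = 0.0
    for a, b, c, d in abcd_per_layer:
        penalty = penalty + ((a - 1.0) ** 2 + (b - 1.0) ** 2 + c ** 2 + d ** 2).sum()
    return recon + weight * penalty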

A plausible implication is that future FA-INR architectures will combine the compactness of memory-augmented cross-attention, the spectral flexibility of explicit activation modulation, and scalable, regularized routing strategies to further advance the resolution/efficiency trade-off in implicit neural representations.
