
Feature-Adaptive Implicit Neural Representations

Updated 17 November 2025
  • FA-INR is a neural architecture that integrates data-driven feature adaptivity into implicit neural representations through dynamic activation modulation and memory-based feature retrieval.
  • It employs techniques such as Incode-style adaptive sine activations and cross-attention with Mixture-of-Experts routing to leverage local data complexity for optimal reconstruction fidelity.
  • Empirical results demonstrate state-of-the-art performance in PSNR, SSIM, and IoU across audio, imaging, and scientific simulations, albeit with increased computational overhead.

Feature-Adaptive Implicit Neural Representation (FA-INR) refers to a class of neural architectures in which the implicit mapping of coordinates (and potentially auxiliary parameters) to signal values is dynamically conditioned on intermediate or global features, enabling model capacity to adapt flexibly to local data complexity. FA-INR methods depart from conventional implicit neural representations (INRs)—where parameters and activation functions are fixed—by introducing explicit mechanisms for feature-driven modulation of either neural activations or feature retrieval, often resulting in superior reconstruction fidelity and parameter efficiency across modalities such as audio, imaging, and scientific simulation.

1. Definitional Scope and Taxonomy

FA-INR encompasses architectures that, during inference, modulate either their intermediate activations or their internal feature representations based on data-adaptive context. According to the taxonomy of INRs (Essakine et al., 2024), feature adaptivity is realized either by predicting the parameters of activation functions using internal statistics (“Incode” style) or by using additional adaptive mechanisms such as cross-attention over feature memory banks governed by Mixture-of-Experts (MoE) routing (Li et al., 7 Jun 2025). Distinct from basic INRs employing static positional encodings or fixed nonlinearities, FA-INR approaches incorporate feature-driven conditioning at key model stages, either intra-layer (activation modulation) or as part of hierarchical representation retrieval (memory attention + routing).

2. Core Methodologies and Architectures

2.1 Activation Modulation: Incode-style Feature Adaptivity

In the “Incode” approach (Essakine et al., 2024), each layer’s activation function is not static but receives its parameters from a learned feature-processing network (the “harmoniser”), whose input is a summary of the previous layer’s activations. For a coordinate $x \in \mathbb{R}^d$ and hidden activations $y_{i-1}$, layer $i$ computes:

$$f_{\text{Incode}}: \qquad y_i = \sigma_i(W_i y_{i-1} + b_i)$$

$$\sigma_i(u) = a_i \sin(b_i \omega u + c_i) + d_i$$

where $(a_i, b_i, c_i, d_i) = H_i(f_i)$ and $f_i$ is a pooled summary (e.g., global average) of $y_{i-1}$. $H_i$ is typically a compact MLP that outputs the per-layer, per-sample sine amplitude, frequency, phase, and bias. Note that the symbol $b_i$ is overloaded: inside $\sigma_i$ it is the predicted frequency scale, whereas in the affine map it is the layer bias. All components are jointly trained to minimise a suitable reconstruction loss (e.g., $L_2$).
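
The layer described above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the helper names (`init_mlp`, `harmoniser`, `incode_layer`), the ReLU hidden layers in the harmoniser, pooling over the sampled points, the `omega=30.0` default, and the offset that starts the layer near a plain sine are all assumptions. The frequency scale $b_i$ is named `freq` in the code to keep it apart from the layer bias.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Create a list of (W, b) pairs for a small fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def harmoniser(params, f):
    """Hypothetical harmoniser H_i: pooled summary f -> (a, freq, c, d)."""
    h = f
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)        # ReLU hidden layers (assumed)
    W, b = params[-1]
    out = h @ W + b                           # four raw outputs
    # Offset so an all-zero output gives the plain sine sin(omega * u).
    a, freq, c, d = out + np.array([1.0, 1.0, 0.0, 0.0])
    return a, freq, c, d

def incode_layer(y_prev, W, bias, h_params, omega=30.0):
    """Compute a * sin(freq * omega * (y_prev @ W + bias) + c) + d."""
    f = y_prev.mean(axis=0)                   # global average over points
    a, freq, c, d = harmoniser(h_params, f)
    u = y_prev @ W + bias                     # affine map of the layer
    return a * np.sin(freq * omega * u + c) + d

rng = np.random.default_rng(0)
coords = rng.uniform(-1.0, 1.0, size=(128, 2))    # 128 points in R^2
W1 = rng.uniform(-0.5, 0.5, size=(2, 16))
b1 = np.zeros(16)
H1 = init_mlp(rng, [2, 8, 4])                     # harmoniser for layer 1
y1 = incode_layer(coords, W1, b1, H1)
print(y1.shape)  # (128, 16)
```

Because the four activation parameters are scalars per layer and per signal, the extra cost in this sketch is one small MLP evaluation per layer; a trainable version would need an autodiff framework.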

2.2 Memory-Augmented Adaptive Representations

In scientific surrogate modeling (Li et al., 7 Jun 2025), FA-INR augments the standard INR mapping from input coordinates (and simulation parameters) to signal values with a learnable key–value memory bank and cross-attention feature retrieval, optionally routed via a coordinate-driven MoE:

  • Spatial Encoder: an MLP encodes the input coordinate.
  • Memory Query: a query vector derived from the encoded coordinate.
  • Memory Bank: a set of keys and a set of values; both learned.
  • Parameter Conditioning: Simulation parameters are embedded, merged (via elementwise product), and used to adapt the memory values by a residual adapter MLP.
  • Cross-Attention: the query attends over the memory keys, and the resulting attention weights combine the adapted values into a retrieved feature vector.
  • Mixture-of-Experts Routing: A low-resolution spatial grid and gating MLP produce expert probabilities; for each input coordinate, Top-2 experts are selected and their outputs aggregated.

This design allows the system to allocate model capacity adaptively, focusing on spatial regions or parameter subspaces that exhibit higher local complexity.
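
As a rough illustration of the retrieval step, the sketch below implements cross-attention over a small key-value memory whose values are conditioned on a parameter embedding. It is a sketch under stated assumptions rather than the published architecture: the function names, the scaled dot-product form of the attention, the shared width `d` for keys and values, and the one-layer `tanh` residual adapter are all hypothetical choices.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)    # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def retrieve(query, keys, values, param_emb, adapter_W):
    """Cross-attention over a key-value memory with conditioned values.

    query:     (n, d)  one query per input coordinate
    keys:      (m, d)  learned memory keys
    values:    (m, d)  learned memory values
    param_emb: (d,)    embedding of the simulation parameters
    adapter_W: (d, d)  weights of a one-layer residual adapter (assumed form)
    """
    merged = values * param_emb                         # elementwise product
    adapted = values + np.tanh(merged @ adapter_W)      # residual adapter
    scores = query @ keys.T / np.sqrt(keys.shape[1])    # scaled dot product
    weights = softmax(scores, axis=-1)                  # (n, m), rows sum to 1
    return weights @ adapted, weights

rng = np.random.default_rng(0)
n, m, d = 5, 32, 8
q = rng.normal(size=(n, d))
K = rng.normal(size=(m, d))
V = rng.normal(size=(m, d))
p = rng.normal(size=d)
A = rng.normal(scale=0.1, size=(d, d))
feat, attn = retrieve(q, K, V, p, A)
print(feat.shape, attn.shape)  # (5, 8) (5, 32)
```

Each retrieved feature is a convex combination of the adapted values, so coordinates whose queries match different keys draw on different parts of the memory.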

3. Algorithmic Descriptions and Key Formulas

3.1 Incode: Feature-Adaptive Sine Activation

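For each hidden layer $i$:

$$f_i = \operatorname{pool}(y_{i-1}), \qquad (a_i, b_i, c_i, d_i) = H_i(f_i)$$

$$y_i = \sigma_i(W_i y_{i-1} + b_i), \qquad \sigma_i(u) = a_i \sin(b_i \omega u + c_i) + d_i$$

Here $\omega$ is a global frequency hyperparameter and $H_i$ is implemented as a small MLP. All parameters are learned end to end.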

3.2 FA-INR with Cross-Attention and MoE

Given an input coordinate, simulation parameters, and a set of experts, the feature retrieval steps are:

  1. Encode the input coordinate (as above).
  2. Embed the simulation parameters and adapt the memory values (value adapter) via a 2-layer MLP.
  3. For each of the Top-2 selected experts (from the gating network), perform projected cross-attention to obtain a feature vector.
  4. Aggregate the feature vectors of the selected experts.
  5. Decode to the output value with a 3-layer ReLU MLP.
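
The routing and aggregation steps above can be sketched as follows. The source states only that the Top-2 experts are selected and their outputs aggregated; the weighted sum with renormalised gate weights used here is an assumption, as are the function names and the stand-in linear experts. For simplicity the sketch evaluates every expert and masks the unselected ones, whereas an efficient implementation would dispatch each point only to its two experts.

```python
import numpy as np

def top2_route(gate_logits):
    """Pick the two highest-scoring experts per point and renormalise."""
    idx = np.argsort(gate_logits, axis=-1)[:, -2:]           # (n, 2) expert ids
    top = np.take_along_axis(gate_logits, idx, axis=-1)      # (n, 2) logits
    top = np.exp(top - top.max(axis=-1, keepdims=True))
    return idx, top / top.sum(axis=-1, keepdims=True)        # weights sum to 1

def moe_features(gate_logits, expert_fns, x):
    """Weighted sum of the outputs of the Top-2 experts for each point."""
    idx, w = top2_route(gate_logits)
    out = None
    for e, fn in enumerate(expert_fns):
        # Weight of expert e per point (zero where e was not selected).
        w_e = (w * (idx == e)).sum(axis=-1, keepdims=True)   # (n, 1)
        y = w_e * fn(x)
        out = y if out is None else out + y
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 3))
mats = [rng.normal(size=(3, 4)) for _ in range(4)]
experts = [lambda z, M=M: z @ M for M in mats]    # stand-in experts
logits = rng.normal(size=(6, 4))                  # stand-in gating output
feats = moe_features(logits, experts, x)
print(feats.shape)  # (6, 4)
```

In the architecture described in the text, each expert would perform the projected cross-attention of step 3 and the gate logits would come from the low-resolution grid and gating MLP.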

4. Empirical Performance and Trade-offs

4.1 Incode (Feature-Adaptive Sine Activation Network)

  • 1D Audio Reconstruction: Second-lowest reconstruction error; sharper reconstructions than non-adaptive baselines at equal iteration counts.
  • 2D CT Reconstruction: Best PSNR and SSIM at all projection counts (e.g., […]); only surpassed by Fr at very low sampling rates.
  • Image Denoising: Highest PSNR ([…] dB) but longest runtime ([…] s); alternatives trade slight PSNR for significant speedup.
  • Super-Resolution: Leading for […] upsampling (PSNR […] dB, SSIM […], LPIPS […]); nearly best for […]; outperformed by Fr/Finer at extreme scales.
  • 3D Occupancy IoU: Near-best performance ([…]); only Finer is marginally higher.
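
For readers unfamiliar with two of the metrics quoted above, the following sketch computes PSNR and IoU with NumPy. It uses the standard definitions; the function names and the `data_range=1.0` default (signals scaled to [0, 1]) are assumptions, and the evaluation protocol of the cited papers may differ.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(range^2 / MSE)."""
    mse = np.mean((np.asarray(reference) - np.asarray(estimate)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)

def iou(occ_a, occ_b):
    """Intersection over union of two boolean occupancy grids."""
    a, b = np.asarray(occ_a, bool), np.asarray(occ_b, bool)
    union = np.logical_or(a, b).sum()
    return 1.0 if union == 0 else np.logical_and(a, b).sum() / union

ref = np.zeros((4, 4))
est = np.full((4, 4), 0.1)        # uniform error of 0.1, so MSE = 0.01
print(psnr(ref, est))             # 20 dB up to floating-point rounding
print(iou([1, 1, 0, 0], [1, 0, 1, 0]))   # one shared cell out of three occupied
```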

Trade-offs: Significant computational overhead due to harmoniser networks; elevated risk of overfitting from per-layer activation freedom; lack of thorough ablations on harmoniser depth or pooling.

4.2 FA-INR with Cross-Attention and MoE (Scientific Surrogates)

  • MPAS-Ocean dataset ([…]M params, […] experts): PSNR […] dB, SSIM […], MD […]; outperforms grid-based and MLP-based methods by wide margins in both accuracy and parameter efficiency.
  • Scaling: Increasing the number of experts from […] leads to PSNR […] dB; diminishing returns beyond […]. Model size remains […]M parameters, versus […]M for grid-based models.
  • Ablations: The memory bank + MoE architecture yields a […] dB gain vs. rigid grids/planes at equal parameter count; Top-2 routing outperforms dense or concatenative aggregation.
  • Efficiency: FA-INR trains […] slower than grid/plane INRs but is far more data-efficient, and it remains much faster than running the original scientific simulators.

5. Comparative Analysis and Best Practices

FA-INR architectures provide superior adaptivity compared to traditional INR variants reliant on fixed grids, positional encodings, or static activation functions. Key implementation practices for state-of-the-art performance include:

Component | Best Practices | Typical Settings
Memory bank size | […] ([…] points); […] ([…]) | […]
Encoder/Decoder | Encoder: 1–4 layers, sine activation; Decoder: 3 layers, ReLU | […]
Routing | Top-2 MoE routing, grid resolution […] | -
Optimizer | Adam, initial lr […] (grid models); […] (MLP baselines) | -

Rigid structural assumptions (such as feature grids or planes) are supplanted by data-adaptive interpolation and routing, yielding a new Pareto frontier of accuracy versus parameter cost. Explicit parameter conditioning adapters are crucial: ablating them reduces PSNR by […] dB.

6. Limitations and Future Research Directions

Despite state-of-the-art results, FA-INR architectures entail several limitations:

  • Computational cost: Harmoniser networks and cross-attention introduce overhead, increasing training and inference latency; Incode and memory-augmented MoE FA-INRs are the slowest candidates among current techniques.
  • Overfitting: Greater flexibility (adaptive sine parameters or adaptive memory routing) introduces additional degrees of freedom, raising overfitting risk, particularly on small or noisy tasks.
  • Ablation uncertainty: There is no consensus on optimal harmoniser or gating grid architecture; further ablations are required to explore trade-offs among expressivity, overhead, and regularization.

Proposed improvement directions include:

  • Streamlining harmonisers (parameter sharing, lightweight attention mechanisms);
  • Hybrid models integrating positional encodings and adaptive fusion;
  • Regularization on adaptive parameters (e.g., […] or attention outputs);
  • Dynamic bandwidth scheduling for scale-adaptive allocation of model capacity.

A plausible implication is that future FA-INR architectures will combine the compactness of memory-augmented cross-attention, the spectral flexibility of explicit activation modulation, and scalable, regularized routing strategies to further advance the resolution/efficiency trade-off in implicit neural representations.
