Deep Parameter Interpolation (DPI)
- Deep Parameter Interpolation (DPI) is a technique that creates continuous neural network models by interpolating between endpoint parameter sets.
- It enables scalar conditioning and smooth transitions between learned behaviors, allowing trade-offs to be tuned continuously in applications such as image smoothing and accelerated MRI reconstruction.
- DPI integrates seamlessly with diverse architectures without structural changes, facilitating resource-efficient and expressive model generation.
Deep Parameter Interpolation (DPI) is a general methodology that constructs a continuous or parameterized family of neural networks by affine interpolation or extrapolation in parameter space between two or more endpoint models. DPI has emerged as a practical, architecture-agnostic tool for generating continuous transformations between learned neural operators, for imparting scalar conditioning to deep networks, and for facilitating resource-efficient model generation for domains such as image processing, medical reconstruction, and generative modeling (Zhao et al., 2020, Qin et al., 2020, Park et al., 26 Nov 2025). By exploiting linear or monotonic paths in weight space, DPI enables dense traversals between network behaviors, scalar-continuous control, and access to model families not obtainable by isolated training.
1. Mathematical Foundations of Deep Parameter Interpolation
The essential operation in DPI is linear interpolation (or its extrapolative generalizations) between sets of network parameters. Suppose two endpoint models of identical architecture are given by parameter sets $\theta_A$ and $\theta_B$; DPI defines, for any scalar $\alpha$, an instantiation $\theta(\alpha) = (1-\alpha)\,\theta_A + \alpha\,\theta_B$ for all layers (Zhao et al., 2020). For $\alpha \in [0, 1]$, this recovers the standard linear interpolation; allowing $\alpha > 1$ or $\alpha < 0$ yields forward and backward extrapolations, supporting functional families beyond the immediate segment determined by the anchor models.
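A minimal sketch of this operation, assuming two trained PyTorch models with identical architectures; the function name and the skipping of non-float buffers are illustrative choices, not details from the cited papers.

```python
import copy
import torch

@torch.no_grad()
def interpolate_models(model_a, model_b, alpha: float):
    """Build a model whose parameters are (1 - alpha) * theta_A + alpha * theta_B.

    alpha in [0, 1] gives linear interpolation between the endpoints;
    alpha < 0 or alpha > 1 extrapolates along the same line in parameter space.
    """
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    mixed = {
        k: (1.0 - alpha) * sd_a[k] + alpha * sd_b[k]
        if torch.is_floating_point(sd_a[k]) else sd_a[k]   # leave integer buffers untouched
        for k in sd_a
    }
    model_mix = copy.deepcopy(model_a)
    model_mix.load_state_dict(mixed)
    return model_mix
```

For example, `interpolate_models(net_a, net_b, 0.5)` instantiates the midpoint operator, while `alpha = 1.3` extrapolates past endpoint B.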
Generalizations in DPI include monotonic interpolation schedules parameterized by learnable functions $\lambda(\cdot)$, so that for a scalar conditioning variable $t \in [t_{\min}, t_{\max}]$, the interpolated parameters at $t$ are $\theta(t) = (1-\lambda(t))\,\theta_A + \lambda(t)\,\theta_B$, where $\lambda$ is constrained to be monotonic and maps $t_{\min}$ to $0$ and $t_{\max}$ to $1$ (Park et al., 26 Nov 2025). Special forms for $\lambda$, such as a softmax-cumulative-sum mapping, allow the network to learn the most effective schedule for interpolation across a given domain.
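One plausible realization of such a schedule is sketched below: a piecewise-linear curve over a fixed grid of bins whose increments come from a softmax over learnable logits, so that the cumulative sum is monotone from $0$ to $1$. The class name, bin count, and gridding are assumptions for illustration, not the exact construction of the cited paper.

```python
import torch
import torch.nn as nn

class MonotonicSchedule(nn.Module):
    """Learnable monotonic mapping lambda: [t_min, t_max] -> [0, 1].

    Softmax over learnable logits gives positive increments summing to 1;
    their cumulative sum defines a monotonically increasing, piecewise-linear
    schedule that starts at 0 and ends at 1.
    """

    def __init__(self, num_bins: int = 32, t_min: float = 0.0, t_max: float = 1.0):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_bins))
        self.t_min, self.t_max = t_min, t_max

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # Normalize t to [0, 1] and locate it on the bin grid.
        u = ((t - self.t_min) / (self.t_max - self.t_min)).clamp(0.0, 1.0)
        increments = torch.softmax(self.logits, dim=0)                       # positive, sums to 1
        knots = torch.cat([increments.new_zeros(1), increments.cumsum(0)])   # lambda at bin edges
        pos = u * len(self.logits)
        idx = pos.floor().long().clamp(max=len(self.logits) - 1)
        frac = pos - idx
        return knots[idx] + frac * increments[idx]                           # linear inside the bin
```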
2. Training Regimes and Scalar Conditioning
DPI is applicable both post-hoc—after endpoint models have been trained—and as an integral part of the training process. For continuous model generation, endpoints are typically trained sequentially by fine-tuning on different label domains or task parameters; the coupling of sequential endpoint fine-tuning (e.g., A→B→A) yields highly correlated parameter sets, enabling stable and meaningful interpolations (Zhao et al., 2020).
For scalar conditioning, DPI maintains two (or more) full parameter sets within a single neural network. The effective network at scalar $t$ is constructed on the fly by interpolating all weights and biases, with the interpolation coefficient set adaptively by a monotonic function $\lambda(t)$ parameterized, for example, via the softmax-cumulative-sum construction. Training involves sampling scalar values $t$, constructing $\theta(t)$, and backpropagating through the interpolation to update both parameter sets and the schedule. This strategy provides a powerful alternative to direct scalar encoding via additional input channels or FiLM-like feature modulation, and allows arbitrary architectures to be augmented with scalar dependency without architectural modifications (Park et al., 26 Nov 2025).
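A hedged sketch of this mechanism, using PyTorch's `torch.func.functional_call` to evaluate the backbone with interpolated parameters so that gradients reach both endpoint parameter sets and the schedule; the wrapper name and the assumption of one scalar $t$ per forward pass are illustrative, not taken from the cited paper.

```python
import copy
import torch
import torch.nn as nn
from torch.func import functional_call

class DPIWrapper(nn.Module):
    """Holds two full parameter sets (theta_A, theta_B) for one backbone and
    evaluates the backbone at parameters theta(t) interpolated on the fly."""

    def __init__(self, backbone: nn.Module, schedule=None):
        super().__init__()
        self.backbone = backbone                    # endpoint A (also defines the architecture)
        self.backbone_b = copy.deepcopy(backbone)   # endpoint B, identically initialized
        self.schedule = schedule                    # learnable monotonic lambda(t), or None

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        lam = self.schedule(t) if self.schedule is not None else t   # default: lambda(t) = t
        params_a = dict(self.backbone.named_parameters())
        params_b = dict(self.backbone_b.named_parameters())
        # Interpolate every parameter tensor; one scalar t is assumed per forward call.
        mixed = {k: (1 - lam) * params_a[k] + lam * params_b[k] for k in params_a}
        return functional_call(self.backbone, mixed, (x,))
```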
3. Integration with Network Architectures and Algorithms
DPI operates at the parameter-tensor level: all modules—convolutional kernels, normalization parameters, biases—are duplicated (or multiplied according to the number of endpoints) and interpolated uniformly. No modifications to core backbone architectures are required, and DPI is agnostic to the detailed layer arrangements (e.g., U-Nets, DUNets, or domain-specific constructs). In specialized cases, such as for image smoothing, DPI is leveraged within architectures employing double-state aggregation (DSA) modules, yielding highly expressive operator families (Zhao et al., 2020).
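A usage sketch under the same assumptions, illustrating the architecture-agnostic workflow: an off-the-shelf backbone (here a stand-in convnet) is wrapped without structural changes, a scalar $t$ is sampled each step, and one optimizer step updates $\theta_A$, $\theta_B$, and the schedule together; data and objective are placeholders.

```python
import torch
import torch.nn as nn

# Any backbone can be wrapped unchanged; a tiny convnet stands in here.
backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 3, 3, padding=1))
model = DPIWrapper(backbone, schedule=MonotonicSchedule(num_bins=16))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for step in range(100):
    x = torch.randn(8, 3, 32, 32)                    # stand-in training batch
    t = torch.rand(())                               # sample a scalar conditioning value
    loss = nn.functional.mse_loss(model(x, t), x)    # stand-in objective
    optimizer.zero_grad()
    loss.backward()                                  # gradients flow to theta_A, theta_B, and lambda
    optimizer.step()
```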
Algorithmic steps for DPI generally include:
- Independent or coupled training of endpoint models $\theta_A$ and $\theta_B$ on separate effect/domain data.
- Interpolation (and possible extrapolation) of every parameter for each desired $\alpha$ or scalar $t$ using the prescribed formulas.
- Inference with the synthesized model, with no need for post-interpolation fine-tuning.
When applied to generative modeling (diffusion or flow-based), DPI cleanly substitutes the usual time-conditioned vector field network $v_\theta(x, t)$ with the interpolated network $v_{\theta(t)}(x)$, achieving scalar dependency without auxiliary scalar-encoding MLPs or input concatenation (Park et al., 26 Nov 2025).
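A hedged sketch of this substitution in a simple flow-matching training step, assuming the `DPIWrapper` above; the straight-line probability path and the velocity target $x_1 - x_0$ are generic flow-matching conventions, not specifics of the cited paper.

```python
import torch

def flow_matching_step(dpi_net, optimizer, x1):
    """One conditional flow-matching step in which the time dependence of the
    velocity field comes entirely from DPI rather than from a t-embedding."""
    x0 = torch.randn_like(x1)          # noise sample
    t = torch.rand(())                 # scalar time shared by the batch (per the sketch above)
    xt = (1 - t) * x0 + t * x1         # point on the straight interpolation path
    target_v = x1 - x0                 # ground-truth velocity along that path
    pred_v = dpi_net(xt, t)            # time enters only through the interpolated weights
    loss = torch.nn.functional.mse_loss(pred_v, target_v)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```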
4. Practical Applications and Empirical Results
DPI has been applied to a variety of tasks:
- Continuous image smoothing: DPI, equipped with CEI (Concurrent Extrapolating and Interpolating) tools, can generate a dense spectrum of deep image operators spanning from one effect to another, supporting linear interpolation as well as extrapolation beyond the endpoints. Quantitatively, DPI-based DSANs (Double-State Aggregation Networks) have demonstrated higher PSNR and SSIM(D) values on challenging test splits than previous state-of-the-art CNNs for smoothing (Zhao et al., 2020).
- Accelerated MRI reconstruction: In MR image reconstruction, DPI produces a continuum between a fidelity-focused reconstruction network (L1 + SSIM loss) and a GAN-augmented, perceptually-oriented one. DPI enables tailored trade-off selection, where intermediate choices recover much of the objective fidelity while enhancing fine texture, as confirmed by PSNR, SSIM, and visual inspection (Qin et al., 2020).
- Scalar conditioning in generative models: DPI enables U-Net and DRUNet architectures to condition natively on time or noise-level scalars required in diffusion and flow-matching models. DPI yields significant improvements in denoising MSE (6–8% reduction), FID (e.g., FID 26.23 to 12.23 on DRUNet diffusion), and sFID, outperforming fixed-embedding and MLP-based scalar-conditioning baselines (Park et al., 26 Nov 2025).
Measured computational costs show that DPI roughly doubles parameter count, but imposes negligible additional FLOPs (<1%) and increases memory only moderately (5–9%) (Park et al., 26 Nov 2025). No extra fine-tuning is required post-interpolation.
5. Architectural and Algorithmic Extensions
DPI has been generalized to allow:
- Extrapolations in parameter space: Piecewise-linear extensions support network instantiations "beyond" the trained endpoints, enabling the synthesis of models with effects not directly represented in training data (Zhao et al., 2020).
- Learnable interpolation schedules: Adaptive, monotonic schedules discovered during training enhance performance, consistently producing lower objective values than fixed linear schedules. Regularization strategies (e.g., entropy penalties) are proposed to maintain smoothness and prevent degenerate jumps in $\lambda(t)$ (Park et al., 26 Nov 2025); a sketch of such a penalty follows this list.
- Integration with double-state aggregation modules: In DPI-based DSANs, DSA modules aggregate local and non-local features between endpoint streams, structurally enforcing the capability to span a richer space of image transformations (Zhao et al., 2020).
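One way such a penalty could look, assuming the softmax-cumulative-sum schedule sketched earlier: treating the softmax increments as a distribution, near-uniform increments (high entropy) correspond to a smooth, near-linear $\lambda$, so the penalty grows when a few bins absorb most of the mass. This is an illustrative regularizer, not the exact formulation of the cited paper.

```python
import torch

def schedule_entropy_penalty(schedule, weight: float = 1e-3) -> torch.Tensor:
    """Small when the schedule's increments are near-uniform (smooth lambda),
    large when a few bins dominate (abrupt jumps); add it to the task loss."""
    increments = torch.softmax(schedule.logits, dim=0)
    entropy = -(increments * increments.clamp_min(1e-12).log()).sum()
    max_entropy = torch.log(torch.tensor(float(increments.numel())))
    return weight * (max_entropy - entropy)   # zero exactly when increments are uniform
```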
6. Comparison to Related Methods and Limitations
DPI is distinct from standard deep network interpolation (DNI), which typically supports only simple convex interpolation between two independently trained models. DPI supports extrapolation and dynamic schedule learning, and can generate model families with greater expressive diversity, all without separate training for each desired operator (Zhao et al., 2020, Park et al., 26 Nov 2025).
Compared with FiLM- or Adaptive Feature Modulation schemes (PAC, AdaFM, CFSNet), DPI eliminates the need for per-layer adapters or explicit modulation, instead learning the full parameter-to-parameter mappings. The approach is, however, associated with a doubling in parameter count and carries a risk of overfitting on limited data, which may be mitigated by weight decay or early stopping (Park et al., 26 Nov 2025).
A limitation is the assumption of identical architectures and compatible parameterizations for interpolation; extensions to nonlinear or learned parameter paths are currently underexplored. Future research directions include automating the choice of interpolation/extrapolation coefficients based on task-specific metrics, and extending DPI to multi-endpoint or higher-dimensional interpolation polytopes.
7. Practical Guidelines
- Initialization: Endpoint parameters should be initialized identically (e.g., Kaiming uniform) to ensure unbiased interpolation (Park et al., 26 Nov 2025).
- Learning rates: Use lower learning rates for the core endpoint weights and a higher one for the interpolation schedule (see the sketch after this list).
- Debugging: Inspect the smoothness and monotonicity of $\lambda(t)$, and verify that interpolated parameters produce stable activations.
- Hyperparameter selection: For trade-off applications, initial exploration at intermediate values of the interpolation coefficient is suggested, adjusting as needed based on downstream fidelity or perceptual requirements (Qin et al., 2020).
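A small sketch of the first two guidelines, assuming the `DPIWrapper` and `MonotonicSchedule` above: deep-copying the initialized backbone gives identically initialized endpoints, and optimizer parameter groups give the schedule its own (higher) learning rate. The specific rates are placeholders.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 3, 3, padding=1))
# DPIWrapper deep-copies the backbone, so theta_A and theta_B start identical.
model = DPIWrapper(backbone, schedule=MonotonicSchedule(num_bins=16))

optimizer = torch.optim.Adam([
    {"params": model.backbone.parameters(),   "lr": 1e-4},  # endpoint A weights (lower lr)
    {"params": model.backbone_b.parameters(), "lr": 1e-4},  # endpoint B weights (lower lr)
    {"params": model.schedule.parameters(),   "lr": 1e-3},  # interpolation schedule (higher lr)
])
```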
In summary, DPI provides a principled and practical approach for generating continuous, scalar-parameterized families of neural networks, enabling enhanced flexibility, continuous control, and improved empirical performance in a range of modern learning systems (Zhao et al., 2020, Qin et al., 2020, Park et al., 26 Nov 2025).