Minimum-Norm Interpolating Solutions

Updated 3 October 2025
  • Minimum-Norm Interpolating Solutions are methods that select the smoothest or simplest function among all exact interpolants by minimizing a chosen norm.
  • They are widely used in approximation theory, signal processing, and machine learning to enable implicit regularization and enhance model generalization.
  • Efficient algorithmic approaches in both finite and infinite-dimensional settings ensure stability and practical recovery in high-dimensional and structured data problems.

Minimum-norm interpolating solutions constitute a class of solutions to interpolation and regression problems that, among all possible functions or parameters that fit the data exactly (i.e., interpolate), minimize a specified (semi-)norm. These solutions provide foundational tools across approximation theory, statistics, signal processing, computational geometry, machine learning, and optimization. The minimum-norm criterion serves as an implicit regularizer, influences the implicit bias of optimization algorithms, and often underlies the theoretical understanding of overparameterized or interpolating models.

1. Fundamental Formulation and Types of Minimum-Norm Interpolation

The minimum-norm interpolation problem—across both finite- and infinite-dimensional settings—seeks a function or vector subject to interpolation constraints that has minimal norm with respect to a chosen functional or vector norm. In the most basic linear setting, given a system $Ax = b$ with $A \in \mathbb{R}^{m \times n}$ and $m < n$, the minimum $\ell_2$-norm solution is

$$x^{\dagger} = \operatorname{argmin} \{ \|x\|_2 : Ax = b \},$$

which is given by $x^{\dagger} = A^T (AA^T)^{-1} b$. In an infinite-dimensional Hilbert space (such as an RKHS), the minimum-norm interpolant for data $(x_j, y_j)_{j=1}^n$ is characterized as

$$\min_{f \in \mathcal{H}_K} \|f\|_K \quad \text{s.t.} \quad f(x_j) = y_j \ \ \forall j,$$

which, by the representer theorem, reduces to a sum of kernels, $f^*(\cdot) = \sum_{j=1}^n c_j K(\cdot, x_j)$, with $K$ the reproducing kernel.
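
Both closed forms above are easy to verify numerically. The following is a minimal sketch (illustrative only, not drawn from the cited papers): the dimensions, data, kernel choice (a Gaussian/RBF kernel), and helper names `rbf_kernel` and `f_star` are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Minimum l2-norm solution of an underdetermined system Ax = b (m < n) ---
m, n = 5, 20
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

x_formula = A.T @ np.linalg.solve(A @ A.T, b)   # x^† = A^T (A A^T)^{-1} b
x_pinv = np.linalg.pinv(A) @ b                  # Moore–Penrose pseudoinverse
print(np.allclose(x_formula, x_pinv))           # True: both give the same interpolant
print(np.allclose(A @ x_formula, b))            # True: the data are fit exactly

# --- Minimum-RKHS-norm interpolant via the representer theorem ---
def rbf_kernel(X, Z, gamma=50.0):
    """Gaussian kernel K(x, z) = exp(-gamma * (x - z)^2) for 1-D inputs."""
    return np.exp(-gamma * (X[:, None] - Z[None, :]) ** 2)

x_train = np.linspace(0.0, 1.0, 8)
y_train = np.sin(2 * np.pi * x_train)

K = rbf_kernel(x_train, x_train)
c = np.linalg.solve(K, y_train)                 # coefficients of f* = sum_j c_j K(., x_j)

def f_star(x_new):
    """Evaluate the minimum-norm interpolant at new points."""
    return rbf_kernel(np.atleast_1d(x_new), x_train) @ c

print(np.max(np.abs(f_star(x_train) - y_train)))  # ~0: exact interpolation
```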

Variants arise for other norms (e.g., minimum $\ell_1$-norm, group Lasso, or nuclear norm for structured parameter recovery; minimum Sobolev norm for bandlimiting or smoothness control; and minimum semi-norm for splines).

2. Minimum-Norm Interpolation in Hilbert and Banach Spaces

In reproducing kernel Hilbert spaces (RKHSs), minimum-norm interpolation admits a closed-form characterization, is stable, and is optimal in both approximation-theoretic and statistical senses. The norm-minimizing property selects a unique solution among the infinitely many interpolants, typically the smoothest or simplest in the RKHS norm. Explicit representer theorems guarantee that the minimizer always lives in the span of the representers of the interpolation functionals.

In Banach spaces (including, e.g., $\ell_1$, $L^1$, or more general function spaces), solution structure is richer and subtler. The generalization of the representer theorem shows that, while explicit formulae may be unavailable, the solution's dual certificate can still be represented as a finite combination of interpolation functionals. This structure enables reduction to finite-dimensional nonlinear systems or fixed-point equations and provides a pathway for algorithmic realization, such as fixed-point schemes for $\ell_1(\mathbb{N})$ (Wang et al., 2020).

The table below summarizes common minimum-norm interpolation settings:

Space               Norm                                   Solution structure (if explicit)
$\ell_2^n$          $\|\cdot\|_2$                          Moore–Penrose pseudoinverse
RKHS                $\|\cdot\|_{\mathcal{H}_K}$            Kernel expansion over data
$\ell_1^n$          $\|\cdot\|_1$                          Basis pursuit; convex optimization
Sobolev             Semi-norm, e.g., $\int |f^{(m)}|^2$    Spline with Green's function
Banach ($\ell_p$)   $\|\cdot\|_p$                          Subdifferential/fixed-point formulation

3. Sparse and Structured Minimum-Norm Interpolation

Minimum-norm solutions with respect to norms promoting structure (e.g., sparsity or low-rankness) are fundamental in high-dimensional statistics and signal processing. For example, for an underdetermined linear system with a sparse solution, the minimum $\ell_1$-norm interpolator (basis pursuit) is

$$\min \|x\|_1 \quad \text{s.t.} \quad Ax = b.$$

Compressed sensing theory shows that under RIP-type conditions and when the measurement matrix is suitably incoherent, this solution coincides with the sparsest one, even for $m \ll n$. The results generalize: group Lasso and nuclear norm interpolators minimize over group-structured or low-rank variables, respectively, and provide order-optimal recovery rates under overparameterization (Chinot et al., 2020, Wang et al., 2021).
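
As a concrete illustration of the basis pursuit program above (a sketch under arbitrary assumptions about dimensions, sparsity level, and measurement matrix; not code from the cited works), the $\ell_1$ minimization can be recast as a linear program and solved with scipy.optimize.linprog:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n, k = 40, 100, 5                            # measurements, dimension, sparsity

A = rng.standard_normal((m, n)) / np.sqrt(m)    # incoherent Gaussian measurements
x_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_true[support] = rng.standard_normal(k)
b = A @ x_true

# min sum(u)  s.t.  -u <= x <= u,  A x = b,  with stacked variables z = [x, u]
I = np.eye(n)
c = np.concatenate([np.zeros(n), np.ones(n)])
A_ub = np.block([[I, -I], [-I, -I]])
b_ub = np.zeros(2 * n)
A_eq = np.hstack([A, np.zeros((m, n))])
bounds = [(None, None)] * n + [(0, None)] * n   # x free, u >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b, bounds=bounds)
x_l1 = res.x[:n]

print("recovery error:", np.linalg.norm(x_l1 - x_true))
```

With these (arbitrary) settings, the minimum $\ell_1$-norm interpolant typically matches the sparse ground truth up to numerical tolerance, illustrating the exact-recovery regime described above.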

In derivative-free optimization (DFO), minimum-norm interpolators with respect to the $\ell_1$-norm on the quadratic part of a polynomial model automatically detect sparsity in the Hessian, providing highly sample-efficient and accurate local models and yielding superior function-evaluation efficiency compared to minimum $\ell_2$-norm or Frobenius-norm approaches (Bandeira et al., 2013).

4. Algorithmic Approaches and Computational Complexity

Efficient computation of minimum-norm interpolating solutions is essential for large-scale problems in coding theory, symbolic computation, and error-correcting code decoding. For polynomial and rational interpolation, fast deterministic, divide-and-conquer algorithms based on minimal (Popov) bases and shift-reduced structures achieve nearly optimal complexity of $O(m^{\omega-1}(\sigma + |s|))$ field operations, where $m$ is the block size, $\omega$ the exponent of matrix multiplication, and $\sigma$ and $s$ the degree and shift parameters (Jeannerod et al., 2015).

In distributed or decentralized optimization, consensus protocols yield distributed minimum-norm interpolators (notably, minimum $\ell_1$-norm solutions, relevant to compressed sensing and sparse recovery) in finite time by orchestrating subgradient flows across network agents (Zhou et al., 2017).

For kernel and spline-based settings, inversion or pseudo-inverses of the kernel/interpolation matrices (or their stabilized variants) produce the minimum-norm solution directly; the solution's (semi-)norm can be interpreted as a risk proxy for generalization or as the maximal amplification factor for interpolation error (Rangamani et al., 2020, Nevskii, 2023).
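
A small sketch of this computation (illustrative only, with a deliberately rank-deficient kernel matrix; not code from the cited works): the pseudoinverse returns the minimum-norm coefficient vector even when plain inversion fails, and the quantity $c^\top K c = \|f\|_K^2$ is the (semi-)norm referred to above. The kernel, data, and duplicated node are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf_kernel(X, Z, gamma=50.0):
    return np.exp(-gamma * (X[:, None] - Z[None, :]) ** 2)

# Duplicating the node at 0.5 makes the kernel matrix exactly singular.
x_train = np.concatenate([np.linspace(0.0, 1.0, 11), [0.5]])
y_train = np.sin(2 * np.pi * x_train)

K = rbf_kernel(x_train, x_train)
c = np.linalg.pinv(K) @ y_train     # minimum-norm coefficients despite rank deficiency

rkhs_norm_sq = c @ K @ c            # ||f||_K^2: usable as a stability / risk proxy
residual = np.max(np.abs(K @ c - y_train))
print(f"interpolation residual: {residual:.1e}, squared RKHS norm: {rkhs_norm_sq:.3f}")
```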

5. Generalization, Implicit Bias, and Modern Machine Learning

Minimum-norm interpolators are central to the understanding of generalization in modern overparameterized and interpolation-based regimes:

  • Benign Overfitting: In high-dimensional linear models, minimum-norm interpolators (both $\ell_2$ and $\ell_1$) can generalize well despite fitting noise, provided appropriate spectral or structural conditions—such as sparse ground truth or low effective rank—are met (Bunea et al., 2020, Wang et al., 2021).
  • Double/Multiple Descent and Implicit Regularization: The risk curve as a function of model capacity may exhibit double descent (for $\ell_2$) or multiple descent (for $\ell_1$), revealing intricate interactions between overparameterization, norm-induced inductive bias, and the interpolation threshold (Li et al., 2021).
  • Implicit Bias of Optimization Algorithms: Gradient descent and its variants on overparameterized models (linear models, shallow ReLU networks) often converge to minimum-norm interpolants (e.g., in the Barron space for ReLU networks), even absent explicit regularization (Park et al., 2023, Vaswani et al., 2020). This “implicit regularization” explains the selection of smooth/simple solutions and underlies observed generalization behavior (see the sketch after this list).
  • Statistical and Approximation-Theoretic Explanations: For kernel interpolators, the minimum-norm solution is shown to minimize cross-validated error bounds (e.g., LOO stability), with the risk and stability tightly controlled by the norm and the condition number of the kernel matrix, resulting in generalization bounds that plateau as the number of parameters increases beyond the sample size (Rangamani et al., 2020, Li, 2020). In deterministic approximation, minimum weighted norm interpolants are shown to converge towards kernel interpolants as the parameterization space is expanded, clarifying the implicit bias in overparameterized regimes (Li, 2020).
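
The implicit-bias point above can be made concrete in the simplest possible setting: for an underdetermined least-squares problem, full-batch gradient descent initialized at zero converges to the minimum $\ell_2$-norm interpolant, because every iterate remains in the row space of the data matrix. A minimal sketch (illustrative assumptions: squared loss, zero initialization, fixed step size; dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 10, 50
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

x = np.zeros(n)                           # zero init keeps iterates in the row space of A
lr = 1.0 / np.linalg.norm(A, 2) ** 2      # step size below 2/L for the quadratic loss
for _ in range(2000):
    x -= lr * A.T @ (A @ x - b)           # gradient of 0.5 * ||Ax - b||^2

x_min_norm = np.linalg.pinv(A) @ b        # explicit minimum l2-norm interpolant
print(np.linalg.norm(x - x_min_norm))     # ~0: GD selected the minimum-norm solution
print(np.linalg.norm(A @ x - b))          # ~0: the data are interpolated exactly
```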

6. Geometric and Approximation-Theoretic Aspects

The geometric configuration of interpolation points can influence the operator norm or worst-case amplification factor of linear interpolation projectors. For instance, in the context of interpolating continuous functions on a Euclidean ball using linear functions, the minimal operator norm is achieved when nodes are the vertices of a regular simplex inscribed in the boundary sphere. The optimal norm is given explicitly by a function involving the dimension, with sharp lower and upper bounds, providing precise guidance for stable scheme design (Nevskii, 2023).

In spline theory and function approximation (including minimum Sobolev norm interpolation and variational splines), the norm minimization principle yields interpolants with optimal smoothness and stability, often possessing precise reproducing or orthogonality properties with respect to the kernel or polynomial subspace involved (Hayotov, 2014, Chandrasekaran et al., 2017, Vlachkova, 2019).
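
As one concrete instance, the natural cubic spline is the interpolant minimizing the Sobolev semi-norm $\int |f''|^2$ among all twice-differentiable interpolants of the data. A short sketch using scipy.interpolate.CubicSpline (the nodes and values are arbitrary illustration data):

```python
import numpy as np
from scipy.interpolate import CubicSpline

x = np.array([0.0, 0.3, 0.55, 0.8, 1.0])
y = np.sin(2 * np.pi * x)

# bc_type="natural" imposes zero second derivatives at the endpoints, which is
# exactly the boundary condition of the semi-norm-minimizing interpolant.
spline = CubicSpline(x, y, bc_type="natural")

grid = np.linspace(x[0], x[-1], 2001)
second_deriv = spline(grid, 2)                          # evaluate f'' on a fine grid
seminorm_sq = np.sum(second_deriv ** 2) * (grid[1] - grid[0])

print("max interpolation error:", np.max(np.abs(spline(x) - y)))
print("Sobolev semi-norm^2    :", seminorm_sq)
```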

7. Practical Applications and Future Directions

Minimum-norm interpolating solutions underpin algorithms in a variety of domains:

  • Derivative-Free and Combinatorial Optimization: Automatic recovery of sparsity/piecewise structure in polynomial models from few function values.
  • Signal and Image Processing: Minimum-norm (often $\ell_1$-based) recovery algorithms in compressed sensing, sparse coding, and super-resolution.
  • Computational Geometry and Point Cloud Processing: Implicit surface reconstruction from raw point clouds via kernel-based and KAN-inspired minimum-norm schemes yields superior normal and surface estimation compared to classical RBF and Hermite approaches (Chu et al., 11 Jul 2025).
  • Machine Learning and High-Dimensional Statistics: Minimum-norm kernel and spline interpolants, deep learning models' implicit bias, and robust estimation in adversarial environments benefit from norm-minimization frameworks.

Active research directions include:

  • Analyzing fine-grained behaviors (e.g., triple- or multi-descent risk curves), algorithmic stability, and conditions under which minimum-norm interpolation yields benign overfitting or consistency.
  • Extending explicit computational methods to broad Banach space settings.
  • Leveraging minimum-norm interpolation in semi-supervised, structured-data, and manifold-regularized learning paradigms.
  • Designing practical, scalable algorithms for minimum-norm interpolation in large-scale, possibly distributed computing environments.

Together, these developments demonstrate the centrality of minimum-norm interpolating solutions in classical approximation theory, contemporary high-dimensional learning, and the modern understanding of implicit regularization and generalization in overparameterized models.
