
Local Linearization Projection

Updated 29 November 2025
  • Local linearization-based projection approximates nonlinear objects by using first-order Taylor expansions to simplify complex projections.
  • It is applied in nonconvex optimization, constrained feasibility, and manifold methods, ensuring efficient computation and reliable convergence.
  • The method underpins statistical learning and visualization techniques, achieving practical accuracy and scalability in high-dimensional settings.

Local linearization-based projection refers to a class of methods that approximate geometric objects (sets, functions, manifolds) or algorithmic steps (projection, gradient, inference) by their first-order (linear or affine) Taylor expansions at or near a current iterate, for the purpose of efficient computation or improved tractability. This paradigm is foundational across projection algorithms for nonconvex feasibility, optimization with nonlinear constraints, modern manifold methods in numerical analysis, local smoothing in nonparametric statistics, and local structure-preserving dimensionality reduction. The approach replaces computationally intractable or nonlinear projection operations with projections onto locally linearized or affine approximations, yielding sharply improved efficiency while preserving attractive convergence or estimation properties in the local regime.

1. General Mathematical Framework

Let $X$ be a finite-dimensional Euclidean space, and let $M \subseteq X$ be a set defined by nonlinear constraints, a nonlinear manifold, or the image of a nonlinear mapping $F: X \to Y$. The standard nearest-point projection onto $M$,

$$P_M(z) = \operatorname*{argmin}_{x\in M} \|x-z\|,$$

is in general a nonconvex and computationally hard problem. Local linearization-based projection approximates $M$ near a nominal point $z$ by its first-order Taylor expansion:

  • If $M = \{x: G(x) \le 0,\ H(x) = 0\}$ with $C^2$ data, linearize the active constraints at $z$:

$$G(z) + \nabla G(z)(x-z) \le 0, \qquad H(z) + \nabla H(z)(x-z) = 0.$$

  • For $M = \{F(x): x\in U\}$, linearize the chart: $F(x) \approx F(z) + \nabla F(z)(x-z)$.

The local projection is then defined by solving a convex quadratic program (QP) or least-squares problem, imposing only the linearized constraints or manifold tangency:

$$\Phi(z) = \operatorname*{argmin}_{x\in X} \tfrac{1}{2}\|x-z\|^2 \quad \text{s.t. the linearized constraints at } z.$$

This inexact projection operator $\Phi$ admits second-order proximity to $P_M(z)$ as $z \to \bar x$ under standard regularity (e.g., the linear-independence constraint qualification, LICQ) (Drusvyatskiy et al., 2018).
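
For equality constraints alone, the linearized projection $\Phi(z)$ has a closed form, so no QP solver is needed. The following NumPy sketch illustrates it under that assumption; the constraint function `H`, its Jacobian `H_jac`, and the sphere example are illustrative placeholders, and inequality constraints would instead require a small QP solve.

```python
import numpy as np

def linearized_projection(z, H, H_jac):
    """Project z onto the affine set {x : H(z) + H'(z)(x - z) = 0}.

    The minimizer of ||x - z||^2 subject to A(x - z) = -H(z), with A = H'(z),
    is x = z - A^T (A A^T)^{-1} H(z), assuming A has full row rank (LICQ at z).
    This is the operator Phi(z) for purely equality-constrained M.
    """
    A = H_jac(z)                                  # m x n Jacobian of H at z
    correction = A.T @ np.linalg.solve(A @ A.T, H(z))
    return z - correction

# Illustrative example: project onto the linearization of the unit sphere H(x) = ||x||^2 - 1.
H = lambda x: np.array([x @ x - 1.0])
H_jac = lambda x: (2.0 * x)[None, :]
z = np.array([1.5, 0.5])
print(linearized_projection(z, H, H_jac))         # [1.05, 0.35]
```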

2. Alternating Projections with Local Linearization

Alternating projections seek a point in $Q \cap M$ for closed sets $Q, M \subset X$ by iteratively projecting onto each set. If $M$ is nonconvex, exact $P_M$ is generally impractical. Linearization-based projection circumvents this by using $\Phi$ from the previous section as a surrogate. The canonical scheme is:

  1. Start at $z^0 \in Q$ near $\bar x \in Q \cap M$.
  2. For $k = 0, 1, \dots$ until convergence:
    • $x^{k+1} \gets \Phi(z^k)$ (local linearized projection toward $M$);
    • $z^{k+1} \gets P_Q(x^{k+1})$.

Under prox-regularity of $Q$, smoothness and LICQ for $M$, and the transversality condition $N_Q(\bar x) \cap (-N_M(\bar x)) = \{0\}$, this algorithm converges locally at a linear rate to a point in $Q \cap M$, with rate controlled by the cosine of the minimal angle between the normal cones (Drusvyatskiy et al., 2018). This is robust to the inexactness of using linearized projections, since the error $\|\Phi(z) - P_M(z)\| = O(d_M(z)^2)$ vanishes faster than the linear contraction.

In the special case where $M$ is a smooth manifold parameterized by a chart $F$, the local tangent-space projection is realized by a least-squares step, optionally followed by a retraction back onto $M$.
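
As a concrete illustration of the scheme above, the sketch below alternates an exact projection onto a box $Q$ with the linearized surrogate $\Phi$ toward a nonconvex set $M = \{x : h(x) = 0\}$. The particular sets, starting point, and stopping rule are assumptions chosen for the example, not prescriptions from the cited analysis.

```python
import numpy as np

def phi(z, h, h_jac):
    """Inexact projection toward M = {x : h(x) = 0} via linearization at z."""
    A = h_jac(z)
    return z - A.T @ np.linalg.solve(A @ A.T, h(z))

def proj_box(x, lo, hi):
    """Exact Euclidean projection onto the box Q = [lo, hi]."""
    return np.clip(x, lo, hi)

# M: unit circle in R^2; Q: the box [0.5, 2] x [-0.25, 0.25] (they intersect near (1, 0)).
h = lambda x: np.array([x @ x - 1.0])
h_jac = lambda x: (2.0 * x)[None, :]
lo, hi = np.array([0.5, -0.25]), np.array([2.0, 0.25])

z = np.array([2.0, 0.25])                  # start in Q, near the intersection
for k in range(50):
    x = phi(z, h, h_jac)                   # linearized projection toward M
    z_new = proj_box(x, lo, hi)            # exact projection back onto Q
    if np.linalg.norm(z_new - z) < 1e-12:
        break
    z = z_new

print(z, h(z))   # converges to a point of the intersection (h(z) is approximately 0)
```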

3. Projected Gradient Methods with Constraint Linearization

When addressing nonlinear constrained optimization, e.g., minimizing $J(z)$ subject to $g(z) \le 0,\ h(z) = 0$, projections onto the true feasible set are nonconvex and expensive. The constraint-linearization method projects the updated iterate only onto the affine-linearized constraints at the current point. Specifically, at each step:

  • Form the linearized constraint set

$$C^{(i)} = \{z\in\mathbb{R}^n: g(z^{(i)}) + \nabla g(z^{(i)})^T (z-z^{(i)}) \le 0,\ \ h(z^{(i)}) + \nabla h(z^{(i)})^T (z-z^{(i)}) = 0\}.$$

  • Take a projected gradient step, projecting onto $C^{(i)}$, not the nonlinear feasible set:

$$z_{G}^{(i)} = \mathrm{Proj}_{C^{(i)}}\bigl(z^{(i)} - \alpha_i \nabla J(z^{(i)})\bigr).$$

  • Update via line search if necessary.

This is neither classical projected gradient descent (the projection is onto an iteration-dependent affine set rather than the true feasible set) nor full SQP (second-order information is omitted). Under regularity assumptions, the method converges globally to a KKT point, and locally at a linear rate near a solution (Torrisi et al., 2016). For nonlinear model predictive control (NMPC), exploiting problem sparsity and introducing slacks for box constraints yield highly efficient implementations.
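
The following sketch runs iterations of this scheme on a toy problem, using `scipy.optimize.minimize` with the SLSQP method as a stand-in solver for the projection QP onto $C^{(i)}$. The objective, constraint, and step size are illustrative assumptions, not the sparsity-exploiting NMPC implementation of Torrisi et al. (2016).

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem: minimize J(z) = ||z - c||^2 subject to g(z) = z1^2 + z2^2 - 1 <= 0.
c = np.array([2.0, 1.0])
J = lambda z: np.sum((z - c) ** 2)
grad_J = lambda z: 2.0 * (z - c)
g = lambda z: np.array([z @ z - 1.0])
grad_g = lambda z: (2.0 * z)[None, :]

def linearized_projection_step(z_i, alpha=0.5):
    """One projected-gradient step, projected onto the constraints linearized at z_i."""
    y = z_i - alpha * grad_J(z_i)                   # plain gradient step
    A, b = grad_g(z_i), -g(z_i)                     # C^(i): A (z - z_i) <= b
    res = minimize(
        lambda z: 0.5 * np.sum((z - y) ** 2),       # projection QP objective
        x0=z_i,
        jac=lambda z: z - y,
        constraints=[{"type": "ineq",
                      "fun": lambda z: b - A @ (z - z_i)}],  # SLSQP wants fun(z) >= 0
        method="SLSQP",
    )
    return res.x

z = np.array([0.5, 0.0])
for _ in range(30):
    z = linearized_projection_step(z)
print(z, g(z))   # approaches the KKT point of the original problem on the unit circle
```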

4. Local Linearization in Statistical and Machine Learning Methods

Local linearization-based projection underpins several approaches in statistics and machine learning:

  • In additive nonparametric regression, local linear smooth backfitting is recast as orthogonal projection of the response vector $Y$ onto the additive subspace in a Hilbert space with an empirical semi-norm (Hiabu et al., 2022). Each iteration alternates local projections onto component function spaces, with convergence rates matching the oracle rate $O_P(n^{-2/5})$ under standard assumptions.
  • For function learning, the "linearization ML" paradigm projects the data onto a globally linear (affine) space via $y'_i = W^\top X_i$, then performs prediction by local consensus among the $k$ nearest neighbors in the 1D output space of this linear projection. This two-phase process can outperform both MLP and logistic regression on some LIBSVM datasets (Tueno, 2019). It differs from classical local linear regression by using a single global projection and only local adaptation in predictor space.
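
A minimal NumPy sketch of this two-phase idea: a single global affine projection fit by least squares, followed by $k$-nearest-neighbor consensus in the projected 1D output space. The dataset, the least-squares fit, and the choice of $k$ are illustrative assumptions, not the exact pipeline of Tueno (2019).

```python
import numpy as np

def fit_global_projection(X, y):
    """Phase 1: fit a global affine map W so that y'_i = W^T x_i approximates y_i."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])   # append a bias column
    W, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return W

def predict_local_consensus(x, X, y, W, k=5):
    """Phase 2: predict by consensus (here, the mean) of the k nearest
    neighbors in the 1D space of projected outputs y' = W^T x."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    proj_train = X1 @ W                              # projected training outputs
    proj_query = np.append(x, 1.0) @ W
    nearest = np.argsort(np.abs(proj_train - proj_query))[:k]
    return y[nearest].mean()

# Illustrative data: a noisy nonlinear target.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))
y = np.sin(2 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.standard_normal(200)

W = fit_global_projection(X, y)
print(predict_local_consensus(np.array([0.2, -0.3, 0.5]), X, y, W))
```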

In Bayesian neural networks, the generalized Gauss-Newton (GGN) approximation is formalized as a local linearization in parameter space:

$$f(x,\theta) \approx f(x,\theta^*) + J(x;\theta^*)(\theta-\theta^*),$$

with $\theta^*$ the MAP point and $J$ the Jacobian. Posterior inference proceeds in the resulting Bayesian GLM, and predictive uncertainty is propagated through this linearization, which stabilizes predictions and enhances out-of-distribution detection compared to naive nonlinear parameter sampling (Immer et al., 2020).
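
Under this linearization, the distribution over linearized network outputs is Gaussian with mean $f(x,\theta^*)$ and covariance $J(x;\theta^*)\,\Sigma\,J(x;\theta^*)^\top$, where $\Sigma$ is the Gaussian posterior covariance over parameters. A minimal NumPy sketch, assuming the MAP prediction, Jacobian, and posterior covariance are already available (all names and numbers below are placeholders):

```python
import numpy as np

def linearized_predictive(f_map, J, Sigma):
    """Predictive distribution of the locally linearized model at one input.

    f_map : f(x, theta*) at the MAP parameters              (shape: [out])
    J     : Jacobian of f(x, theta) w.r.t. theta at theta*  (shape: [out, p])
    Sigma : Gaussian posterior covariance over parameters    (shape: [p, p])

    With f(x, theta) ~ f(x, theta*) + J (theta - theta*) and
    theta ~ N(theta*, Sigma), outputs are Gaussian with the moments below.
    """
    mean = f_map
    cov = J @ Sigma @ J.T
    return mean, cov

# Toy numbers: 2 outputs, 3 parameters.
f_map = np.array([0.3, -1.2])
J = np.array([[0.5, -0.1, 0.2],
              [0.0,  0.4, 0.3]])
Sigma = np.diag([0.1, 0.05, 0.2])
print(linearized_predictive(f_map, J, Sigma))
```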

5. Applications in Multidimensional Projection and Visualization

For dimensionality reduction (DR) and data visualization, local linearization-based projection is instrumental in understanding and mapping the deformation of high-dimensional local subspaces under possibly nonlinear projections:

  • Define local subspaces at each sample as ellipsoids from PCA of the $k$ nearest neighbors.
  • The projection $\pi: \mathbb{R}^D \to \mathbb{R}^d$ is often defined implicitly as the solution to a local nonlinear optimization.
  • The Jacobian $J(x) = \partial \pi/\partial x$ is computed analytically via the implicit function theorem, exploiting

$$J(x) = -\left[\frac{\partial^2 f}{\partial y^2}\right]^{-1}\left[\frac{\partial^2 f}{\partial x\,\partial y}\right].$$

  • Local subspace basis directions are then mapped via $v_i = J(x)V_i$, producing a visualization glyph that encodes subspace stretching or rotation (Bian et al., 2020).
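
The sketch below illustrates the last two steps numerically: a finite-difference Jacobian stands in for the implicit-function-theorem formula, and local PCA directions are pushed forward as $v_i = J(x)V_i$. The projection map `pi`, the neighborhood size, and the data are illustrative assumptions, not the setup of Bian et al. (2020).

```python
import numpy as np

def numerical_jacobian(pi, x, eps=1e-6):
    """Finite-difference stand-in for the analytic Jacobian J(x) = d pi / d x."""
    base = pi(x)
    J = np.zeros((base.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (pi(xp) - base) / eps
    return J

def local_subspace_directions(X, idx, k=10, n_dirs=2):
    """PCA basis of the k nearest neighbors of sample idx (columns = directions)."""
    d = np.linalg.norm(X - X[idx], axis=1)
    nbrs = X[np.argsort(d)[:k]]
    _, _, Vt = np.linalg.svd(nbrs - nbrs.mean(axis=0), full_matrices=False)
    return Vt[:n_dirs].T                       # D x n_dirs

# Illustrative nonlinear projection R^3 -> R^2 (placeholder for the DR map).
pi = lambda x: np.array([x[0] + 0.1 * x[2] ** 2, x[1] - 0.2 * x[0] * x[2]])

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 3))
idx = 0
V = local_subspace_directions(X, idx)          # high-dimensional basis directions
J = numerical_jacobian(pi, X[idx])
v = J @ V                                      # mapped directions v_i = J(x) V_i for the glyph
print(v)
```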

Empirical results demonstrate that this approach achieves high numerical accuracy (mean angular error of $0.005^\circ$ on synthetic planar data), and its glyph-based visualization reveals subtle global and local data structures unobservable in standard scatterplots.

6. Connections to Nonlinear Boundary Value Problems and Numerical Analysis

Local linearization-based projection generalizes to iterative methods for nonlinear boundary value problems (BVPs). For two-point BVPs, the shooting-projection iteration (SPI) method reformulates standard shooting by:

  • Given a shooting trajectory $y_k$, construct a "projection" $y_{k+1}$ as the solution to the linearized BVP

$$y_{k+1}'(x) = L_k(x)\,[y_{k+1}(x) - y_k(x)] + f(x, y_k(x)), \qquad y_{k+1}(a) = \alpha,\ \ y_{k+1}(b) = \beta,$$

with $L_k(x)$ determined via Newton, Picard, or constant-slope linearizations.

  • The procedure is a projection in function space onto the affine subspace satisfying the two boundary conditions, and yields the familiar shooting method updates (including Newton and fixed-point shooting) (Faragó et al., 2020).

Convergence rates are quadratic (Newton), or linear (Picard/constant-slope), reflecting their underlying linearization properties. The projection perspective offers a unifying explanation for the convergence and error-correction mechanisms of shooting and relaxation methods.
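
A minimal sketch of the projection perspective on shooting, simplified to a linear second-order toy problem: each shooting trajectory is projected (in the $H^1$ semi-norm) onto functions satisfying both boundary conditions, which amounts to adding a linear correction, and the slope of that correction updates the next shot. The toy equation, the explicit slope-update rule, and the use of `scipy.integrate.solve_ivp` are illustrative assumptions and a simplification of the constructions in Faragó et al. (2020).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative second-order BVP: y'' = -y, y(0) = 0, y(1) = 1 (exact: sin(x)/sin(1)).
a, b, alpha, beta = 0.0, 1.0, 0.0, 1.0
rhs = lambda x, u: [u[1], -u[0]]            # first-order system in (y, y')

def shoot(slope):
    """Solve the IVP with initial data (alpha, slope) over [a, b]."""
    return solve_ivp(rhs, (a, b), [alpha, slope], rtol=1e-9)

slope = 0.0                                  # initial shooting slope
for k in range(20):
    traj = shoot(slope)
    y_b = traj.y[0, -1]                      # value of the shooting trajectory at x = b
    # The H^1-seminorm projection onto {y : y(a) = alpha, y(b) = beta} adds the
    # linear function ((beta - y_b) / (b - a)) * (x - a) to the trajectory
    # (the left boundary condition already holds); its derivative at x = a
    # supplies the slope correction for the next shot.
    slope += (beta - y_b) / (b - a)

print(slope, shoot(slope).y[0, -1])          # y(b) converges to beta
```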

7. Theoretical Properties and Computational Aspects

The following table summarizes theoretical properties and complexity considerations across representative contexts:

| Domain | Linearized Projection Object | Convergence / Accuracy |
|---|---|---|
| Feasibility (alternating projections) | Polyhedral/affine set (QP/LS step) | Local linear, rate $\approx \cos\alpha$ (Drusvyatskiy et al., 2018) |
| Constrained optimization (NLP) | Affine-linearized constraint set | Local linear; global with augmented Lagrangian safeguards (Torrisi et al., 2016) |
| Nonparametric statistics | Additive subspace (empirical semi-norm) | Oracle-optimal $O_P(n^{-2/5})$ (Hiabu et al., 2022) |
| Dimensionality reduction (visualization) | Local subspace, Jacobian map | Glyphs about two orders of magnitude more accurate (Bian et al., 2020) |
| Shooting for BVPs | Linearized BVP operator | Quadratic (Newton), linear (Picard) (Faragó et al., 2020) |

Computational complexity is often dominated by small-scale QP or least-squares solves per iteration, leveraging only first derivatives. Line search or augmented Lagrangian terms ensure robustness in nonconvex contexts. In statistical applications, the block structure of projection operators facilitates scalable implementation.


Local linearization-based projection unifies a broad spectrum of algorithms in optimization, numerical analysis, statistical learning, and high-dimensional data analysis. By systematically replacing nonlinear or nonconvex projection operations with tractable linear or affine surrogates, these methods offer both practical efficiency and strong theoretical guarantees under local regularity and transversality conditions. Empirical and mathematical results across multiple domains confirm the versatility and foundational role of this approach (Drusvyatskiy et al., 2018, Torrisi et al., 2016, Tueno, 2019, Hiabu et al., 2022, Bian et al., 2020, Immer et al., 2020, Faragó et al., 2020).
