Tensor-Based Proximal Alternating Minimization

Updated 7 January 2026
  • The paper introduces a novel tensor-based proximal alternating minimization method that reformulates inhomogeneous quartic optimization into a four-block multilinear problem, enabling efficient closed-form updates.
  • It employs block coordinate descent with proximal regularization, ensuring strong convexity of each subproblem and guaranteeing convergence under mild assumptions.
  • Empirical studies on Bose–Einstein condensate simulations demonstrate rapid convergence and lower computational cost compared to inexact ADMM across various discretizations.

A tensor-based proximal alternating minimization (PAM) algorithm is a numerical optimization approach developed for solving inhomogeneous quartic polynomial optimization problems on the sphere, with a structure inspired by applications such as discretized Bose–Einstein condensate (BEC) ground state computations. The method constructs an equivalence between a fourth-degree inhomogeneous polynomial minimization and a four-block multilinear optimization problem (MOP), exploiting tensor representations and block coordinate descent with proximal regularization for efficient, closed-form iterative updates. The convergence of the algorithm is established under mild assumptions, and empirical studies demonstrate notable gains compared to alternative methods such as inexact ADMM (Chen et al., 31 Dec 2025).

1. Problem Formulation and Tensor-Multilinear Equivalence

The method addresses minimization of an inhomogeneous quartic polynomial of the form

$$\min_{x\in\mathbb{R}^n}\; f(x) = \frac{\theta}{2}\sum_{i=1}^n x_i^4 + x^T B x \quad\text{s.t.}\quad \|x\|=1,$$

where $\theta > 0$ and $B\in\mathbb{R}^{n\times n}$ is symmetric. Any homogeneous degree-4 polynomial $g(x)$ in $n$ variables can be expressed via a symmetric order-4 tensor $\mathcal{T}$ as $g(x)=\langle \mathcal{T},x\circ x\circ x\circ x\rangle$. In the inhomogeneous case, the variable is lifted to $\tilde x=(1, x^T)^T \in \mathbb{R}^{n+1}$, and the homogenized tensor $\mathcal{T}_f\in\mathbb{S}^{4,n+1}$, a symmetric order-4 tensor of dimension $n+1$, encodes the full objective.
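
The lifting step can be checked numerically on a small instance. The sketch below is one possible construction of the homogenized tensor, not necessarily the one used by the authors: it places $\theta/2$ on the quartic diagonal and spreads $B$ evenly over the six ways of assigning the lifted coordinate to two of the four modes. The helper names `quartic_objective` and `homogenized_tensor` are hypothetical.

```python
from itertools import combinations

import numpy as np


def quartic_objective(x, B, theta):
    """f(x) = theta/2 * sum_i x_i^4 + x^T B x."""
    return 0.5 * theta * np.sum(x**4) + x @ B @ x


def homogenized_tensor(B, theta):
    """Symmetric order-4 tensor T of dimension n+1 with <T, xt^4> = f(x).

    Index 0 is the lifted coordinate (fixed at 1); indices 1..n carry x.
    This construction is an illustrative assumption.
    """
    n = B.shape[0]
    T = np.zeros((n + 1,) * 4)
    for i in range(1, n + 1):
        T[i, i, i, i] = 0.5 * theta          # quartic part
    for p, q in combinations(range(4), 2):   # 6 placements of the two "1" slots
        idx = [slice(1, None)] * 4
        idx[p] = idx[q] = 0
        T[tuple(idx)] += B / 6.0             # quadratic part, symmetrized
    return T


rng = np.random.default_rng(0)
n, theta = 5, 2.0
A = rng.standard_normal((n, n))
B = 0.5 * (A + A.T)                          # symmetric test matrix
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                       # point on the unit sphere
xt = np.concatenate(([1.0], x))              # lifted variable (1, x^T)^T

T = homogenized_tensor(B, theta)
lifted_value = np.einsum("ijkl,i,j,k,l->", T, xt, xt, xt, xt)
direct_value = quartic_objective(x, B, theta)
```

Here `lifted_value` and `direct_value` should agree up to rounding, which is the sense in which the tensor "encodes the full objective".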

Defining a corresponding four-way multilinear function,

$$F(x,y,z,w) = \langle \mathcal{T}_f, \tilde x\circ\tilde y\circ\tilde z\circ\tilde w\rangle,$$

with each $\tilde x, \tilde y, \tilde z, \tilde w \in \mathbb{R}^{n+1}$ (first entry fixed at 1, remainder from the respective variables), a key result is that, assuming a mild concavity condition on $u\mapsto\mathcal{T}_f u^4$, minimization of the original quartic is equivalent to minimizing $F$ over four blocks constrained to the unit sphere:

$$\min_{\|x\|=1} f(x) = \min_{\|x\|=\|y\|=\|z\|=\|w\|=1} F(x, y, z, w).$$

To guarantee global concavity, a shift $-\alpha\|\tilde x\|^4$ is introduced, with $\alpha \geq \|\mathcal{T}_f\|$. The modified problem is

$$f_\alpha(x) = \langle \mathcal{T}_f, \tilde x\circ\tilde x\circ\tilde x\circ\tilde x \rangle - \alpha \|\tilde x\|^4,$$

and its multilinear equivalent is

$$F_\alpha(x, y, z, w) = \langle \mathcal{T}_f, \tilde x\circ\tilde y\circ\tilde z\circ\tilde w \rangle - \alpha \langle \tilde x, \tilde y \rangle \langle \tilde z, \tilde w \rangle,$$

with the minimizers of both formulations coinciding under these conditions (Chen et al., 31 Dec 2025).
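
One reason the shift does not move the minimizers is that it is constant on the feasible set: for $\|x\|=1$ the lifted vector satisfies $\|\tilde x\|^2 = 1 + \|x\|^2 = 2$, so $\alpha\|\tilde x\|^4 = 4\alpha$. The following sketch checks this on the diagonal $x=y=z=w$. The random tensor is only a stand-in for $\mathcal{T}_f$, and the value of `alpha` is an arbitrary illustrative choice; `lift` and `F_alpha` are hypothetical helper names.

```python
import numpy as np


def lift(v):
    """Prepend the fixed entry 1 to obtain the lifted vector (1, v^T)^T."""
    return np.concatenate(([1.0], v))


def F_alpha(T, alpha, x, y, z, w):
    """Shifted four-block multilinear form from the text."""
    xt, yt, zt, wt = lift(x), lift(y), lift(z), lift(w)
    multilinear = np.einsum("ijkl,i,j,k,l->", T, xt, yt, zt, wt)
    return multilinear - alpha * (xt @ yt) * (zt @ wt)


rng = np.random.default_rng(1)
n = 4
T = rng.standard_normal((n + 1,) * 4)    # stand-in tensor, not a real T_f
alpha = np.linalg.norm(T)                # illustrative choice only

x = rng.standard_normal(n)
x /= np.linalg.norm(x)
xt = lift(x)

unshifted = np.einsum("ijkl,i,j,k,l->", T, xt, xt, xt, xt)
diagonal_value = F_alpha(T, alpha, x, x, x, x)
lifted_norm_sq = xt @ xt                 # equals 2 on the unit sphere
```

On the diagonal, `diagonal_value` differs from `unshifted` by exactly $4\alpha$, independent of $x$.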

2. Multi-Block Structure and Blockwise Minimization

The MOP

$$\min_{\|x\|=\|y\|=\|z\|=\|w\|=1} F_\alpha(x, y, z, w)$$

exhibits a four-block structure, each block constrained to the unit sphere. With three blocks frozen, $F_\alpha$ is affine in the remaining block, so each subproblem consists of an affine function plus a quadratic proximal term. This multilinearity is what lets block coordinate descent (BCD) deliver closed-form updates at each step.

This natural cyclic update scheme $(x \to y \to z \to w)$ is the basis for the overall proximal alternating minimization algorithm.
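
The block structure can be made concrete with a small check. Since $F_\alpha$ is multilinear in the lifted variables, freezing $y$, $z$, and $w$ leaves a function of $x$ whose differences are captured by a single vector $g$. The sketch below verifies $F_\alpha(x_1,\cdot) - F_\alpha(x_2,\cdot) = \langle g, x_1 - x_2\rangle$ on a random stand-in tensor; the helper names and the value of `alpha` are assumptions made for illustration.

```python
import numpy as np


def lift(v):
    """Lifted vector (1, v^T)^T."""
    return np.concatenate(([1.0], v))


def F_alpha(T, alpha, x, y, z, w):
    xt, yt, zt, wt = lift(x), lift(y), lift(z), lift(w)
    return (np.einsum("ijkl,i,j,k,l->", T, xt, yt, zt, wt)
            - alpha * (xt @ yt) * (zt @ wt))


def x_block_gradient(T, alpha, y, z, w):
    """Gradient of F_alpha with respect to x when y, z, w are frozen."""
    yt, zt, wt = lift(y), lift(z), lift(w)
    contraction = np.einsum("ijkl,j,k,l->i", T, yt, zt, wt)
    # Drop the entry belonging to the fixed lifted coordinate.
    return contraction[1:] - alpha * (zt @ wt) * y


rng = np.random.default_rng(2)
n, alpha = 4, 1.5                          # alpha is arbitrary here
T = rng.standard_normal((n + 1,) * 4)      # stand-in tensor


def unit(m):
    v = rng.standard_normal(m)
    return v / np.linalg.norm(v)


y, z, w, x1, x2 = (unit(n) for _ in range(5))
g = x_block_gradient(T, alpha, y, z, w)
lhs = F_alpha(T, alpha, x1, y, z, w) - F_alpha(T, alpha, x2, y, z, w)
rhs = g @ (x1 - x2)
```

Because `lhs` and `rhs` agree for arbitrary `x1` and `x2`, the $x$-subproblem is affine in $x$ before the proximal term is added.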

3. Proximal Alternating Minimization Updates

Defining $\mathcal{G} = \{ \tilde x=(1, x^T)^T : \|x\|=1 \}$, the PAM algorithm performs, for $k=0,1,2,\dots$,

$$\begin{cases}
\tilde x^{(k+1)} = \displaystyle\arg\min_{\tilde x \in \mathcal{G}} \left\{ F_\alpha(x, y^{(k)}, z^{(k)}, w^{(k)}) + \frac{\gamma_1}{2}\|\tilde x - \tilde x^{(k)}\|^2 \right\}, \\
\tilde y^{(k+1)} = \displaystyle\arg\min_{\tilde y \in \mathcal{G}} \left\{ F_\alpha(x^{(k+1)}, y, z^{(k)}, w^{(k)}) + \frac{\gamma_2}{2}\|\tilde y - \tilde y^{(k)}\|^2 \right\}, \\
\tilde z^{(k+1)} = \displaystyle\arg\min_{\tilde z \in \mathcal{G}} \left\{ F_\alpha(x^{(k+1)}, y^{(k+1)}, z, w^{(k)}) + \frac{\gamma_3}{2}\|\tilde z - \tilde z^{(k)}\|^2 \right\}, \\
\tilde w^{(k+1)} = \displaystyle\arg\min_{\tilde w \in \mathcal{G}} \left\{ F_\alpha(x^{(k+1)}, y^{(k+1)}, z^{(k+1)}, w) + \frac{\gamma_4}{2}\|\tilde w - \tilde w^{(k)}\|^2 \right\}.
\end{cases}$$

Each block update is a strongly convex quadratic minimization on the sphere, with a closed-form solution:

$$x^{(k+1)} = \pm \frac{g_x^{(k)}-\gamma_1 x^{(k)}}{\|g_x^{(k)}-\gamma_1 x^{(k)}\|},$$

with the sign chosen so that the subproblem objective is minimized, where

$$g_x^{(k)} = I_{[2:n+1]} \langle \mathcal{T}_f, \tilde y^{(k)} \circ \tilde z^{(k)} \circ \tilde w^{(k)} \rangle - \alpha \langle \tilde z^{(k)},\tilde w^{(k)} \rangle y^{(k)}.$$

Here, $I_{[2:n+1]}$ selects entries $2$ through $n+1$ of the contracted vector, i.e., those corresponding to $x$ in the lifted variable. The regularization parameters $\gamma_i > 0$ ensure the strong convexity of each subproblem.
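
The closed form can be sanity-checked in isolation. On the unit sphere, $\frac{\gamma}{2}\|x - x^{(k)}\|^2 = \gamma - \gamma\langle x, x^{(k)}\rangle$, so minimizing $\langle g, x\rangle + \frac{\gamma}{2}\|x - x^{(k)}\|^2$ amounts to minimizing the linear function $\langle g - \gamma x^{(k)}, x\rangle$, whose minimizer is the negatively normalized direction. The sketch below uses a random vector `g` as a stand-in for $g_x^{(k)}$ and compares the closed form with random points on the sphere; function names are hypothetical.

```python
import numpy as np


def sphere_prox_update(g, x_prev, gamma):
    """Minimize <g, x> + gamma/2 * ||x - x_prev||^2 over ||x|| = 1."""
    d = g - gamma * x_prev
    norm = np.linalg.norm(d)
    if norm == 0.0:
        return x_prev.copy()      # every feasible point is optimal
    return -d / norm              # minus sign: this is a minimization


def subproblem(x, g, x_prev, gamma):
    return g @ x + 0.5 * gamma * np.sum((x - x_prev) ** 2)


rng = np.random.default_rng(3)
n, gamma = 6, 0.5
g = rng.standard_normal(n)        # stand-in for the block gradient
x_prev = rng.standard_normal(n)
x_prev /= np.linalg.norm(x_prev)

x_new = sphere_prox_update(g, x_prev, gamma)
closed_form_value = subproblem(x_new, g, x_prev, gamma)

# Brute-force comparison against random feasible points.
samples = rng.standard_normal((2000, n))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)
best_sampled = min(subproblem(s, g, x_prev, gamma) for s in samples)
```

In this sketch the minus sign is the branch of $\pm$ that attains the minimum; the plus sign would give the maximizer of the same subproblem.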

4. Convergence Properties

Key convergence assumptions include (i) $\alpha \geq \|\mathcal{T}_f\|$ for concavity, (ii) positive proximal coefficients $\gamma_i$, and (iii) compactness of the feasible set $\mathcal{G}^4$. Theoretical results (Theorems 4.1 and 4.2) establish:

  1. Descent: After each full cycle, the objective $F_\alpha$ decreases by at least a positive multiple of the squared step norm, which guarantees monotone convergence of the objective values.
  2. Vanishing Steps: The sum of squared step sizes is finite; thus, $\|\tilde t^{(k+1)} - \tilde t^{(k)}\| \to 0$ as $k\to\infty$ (with $\tilde t^{(k)}$ collecting all lifted variables).
  3. Cluster Points: Compactness ensures existence of accumulation points.
  4. Stationarity: Any cluster point satisfies the KKT conditions for the MOP, i.e., each block's variational inequality for $F_\alpha$.
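
The descent property is easy to observe on a toy problem. The following is a minimal sketch of the four-block cycle under stated assumptions, not the authors' implementation: the tensor construction, the Frobenius-norm choice of `alpha`, the constant `gammas`, and all function names are illustrative.

```python
from itertools import combinations

import numpy as np


def lift(v):
    return np.concatenate(([1.0], v))


def homogenized_tensor(B, theta):
    """Illustrative symmetric tensor with <T, xt^4> = theta/2*sum x^4 + x^T B x."""
    n = B.shape[0]
    T = np.zeros((n + 1,) * 4)
    for i in range(1, n + 1):
        T[i, i, i, i] = 0.5 * theta
    for p, q in combinations(range(4), 2):
        idx = [slice(1, None)] * 4
        idx[p] = idx[q] = 0
        T[tuple(idx)] += B / 6.0
    return T


def F_alpha(T, alpha, blocks):
    xt, yt, zt, wt = (lift(v) for v in blocks)
    return (np.einsum("ijkl,i,j,k,l->", T, xt, yt, zt, wt)
            - alpha * (xt @ yt) * (zt @ wt))


def block_gradient(T, alpha, blocks, b):
    """Gradient of F_alpha with respect to block b, other blocks frozen."""
    lifted = [lift(v) for v in blocks]
    out = T
    for mode in (3, 2, 1, 0):              # contract highest modes first
        if mode != b:
            out = np.tensordot(out, lifted[mode], axes=([mode], [0]))
    p, q = (2, 3) if b < 2 else (0, 1)     # the pair not containing b
    return out[1:] - alpha * (lifted[p] @ lifted[q]) * blocks[b ^ 1]


def pam(T, alpha, blocks, gammas, iters):
    blocks = [v.copy() for v in blocks]
    history = [F_alpha(T, alpha, blocks)]
    for _ in range(iters):
        for b in range(4):                 # cyclic order x, y, z, w
            d = block_gradient(T, alpha, blocks, b) - gammas[b] * blocks[b]
            norm = np.linalg.norm(d)
            if norm > 0.0:
                blocks[b] = -d / norm      # closed-form sphere update
        history.append(F_alpha(T, alpha, blocks))
    return blocks, history


rng = np.random.default_rng(4)
n, theta = 4, 1.0
A = rng.standard_normal((n, n))
T = homogenized_tensor(0.5 * (A + A.T), theta)
alpha = np.linalg.norm(T)                  # assumed upper bound on ||T_f||
start = [v / np.linalg.norm(v) for v in rng.standard_normal((4, n))]
final_blocks, history = pam(T, alpha, start, gammas=[0.5] * 4, iters=50)
```

Each block update minimizes its proximal subproblem exactly, so the recorded values in `history` should never increase from one cycle to the next.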

5. Computational Complexity Per Iteration

Each block update's primary expense is the tensor contraction $\langle\mathcal{T}_f, \tilde u\circ\tilde v\circ\tilde w\rangle$. For a general dense order-4 tensor, this cost scales as $O(n^4)$. Nevertheless, when $\mathcal{T}_f$ arises from BEC applications, e.g., via discretized Gross–Pitaevskii energies, the tensor exhibits significant sparsity and symmetry; the dominant cost then reduces to $O(n^2)$ or even $O(n)$ per contraction for 1D and 2D finite-difference grids.

Each block update further requires $O(n)$ operations for normalization and vector addition, so, in practice, total cost per full iteration is $O(n^2)$ in 1D and can approach $O(n)$ in 2D with optimal exploitation of tensor structure.
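
To illustrate how structure removes the $O(n^4)$ cost, the sketch below computes the contraction for the objective $f(x)=\frac{\theta}{2}\sum_i x_i^4 + x^T B x$ without ever forming the tensor. The formula $\frac{\theta}{2}\, y\odot z\odot w + \frac{1}{6}B(y+z+w)$ is derived here from one particular symmetric construction of $\mathcal{T}_f$ and is an assumption of this example, as is the tridiagonal second-difference matrix used for $B$; it is compared against the dense contraction.

```python
from itertools import combinations

import numpy as np


def lift(v):
    return np.concatenate(([1.0], v))


def homogenized_tensor(B, theta):
    """Dense reference tensor (illustrative construction)."""
    n = B.shape[0]
    T = np.zeros((n + 1,) * 4)
    for i in range(1, n + 1):
        T[i, i, i, i] = 0.5 * theta
    for p, q in combinations(range(4), 2):
        idx = [slice(1, None)] * 4
        idx[p] = idx[q] = 0
        T[tuple(idx)] += B / 6.0
    return T


def structured_contraction(B, theta, y, z, w):
    """Entries 2..n+1 of <T_f, yt o zt o wt>, computed without forming T_f.

    Cost: one elementwise product plus one product with B, i.e. O(n^2)
    for dense B and O(n) if B is stored in a banded or sparse format.
    """
    return 0.5 * theta * y * z * w + B @ (y + z + w) / 6.0


rng = np.random.default_rng(5)
n, theta = 6, 1.5
# Hypothetical 1D second-difference matrix standing in for a discretized operator.
B = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

y, z, w = (v / np.linalg.norm(v) for v in rng.standard_normal((3, n)))

T = homogenized_tensor(B, theta)                                   # O(n^4) storage
dense = np.einsum("ijkl,j,k,l->i", T, lift(y), lift(z), lift(w))[1:]
fast = structured_contraction(B, theta, y, z, w)
```

The two results coincide, while the structured version touches only $B$ and three vectors.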

6. Hyperparameter Selection and Acceleration

Several algorithmic choices affect performance:

  • Shift parameter $\alpha$: Chosen to satisfy $\alpha \geq \|\mathcal{T}_f\| \approx \max|\mathcal{T}_f u^4|^{1/4}$. Slightly larger $\alpha$ improves concavity and numerical stability, but excessive scaling suppresses the multilinear term, slowing convergence.
  • Proximal coefficients $\gamma_i$: Any positive value yields convergence. In practice, a moderate constant (e.g., $0.5$) balances progress and stability.
  • Initialization: Random unit-norm initialization suffices; marginally improved warm starts may be obtained via a few steps of the power method or maximum block improvement (MBI).
  • Acceleration strategies: Proximal coefficients may be increased adaptively when little progress is detected, or SQUAREM-type extrapolation may be applied to the iterates for superlinear convergence in early phases.
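
The text does not state which norm $\|\mathcal{T}_f\|$ denotes. If it is the spectral norm (the largest value of $|\langle \mathcal{T}, u\circ v\circ w\circ s\rangle|$ over unit vectors), then the Frobenius norm is a cheap upper bound by the Cauchy–Schwarz inequality, and so one conservative way to pick the shift. The sketch below illustrates that bound on a random stand-in tensor; it is an assumption of this example rather than the selection rule of the source, and the bound may be loose enough to slow convergence as noted in the list above.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
T = rng.standard_normal((n + 1,) * 4)   # stand-in tensor, not a real T_f

# Frobenius norm: 2-norm of the flattened array.
alpha = np.linalg.norm(T)


def unit(m):
    v = rng.standard_normal(m)
    return v / np.linalg.norm(v)


# Sample the multilinear form at random unit vectors; every sampled value
# is bounded in magnitude by the Frobenius norm.
sampled = [
    abs(np.einsum("ijkl,i,j,k,l->", T,
                  unit(n + 1), unit(n + 1), unit(n + 1), unit(n + 1)))
    for _ in range(500)
]
largest_sampled = max(sampled)
```

Random sampling only gives a lower estimate of the spectral norm, so the gap between `largest_sampled` and `alpha` indicates how conservative the Frobenius choice can be.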

7. Numerical Experiments and Performance Comparison

Empirical validation employed synthetic BEC data, with both 1D and 2D Gross–Pitaevskii energy discretizations. The algorithm was compared to an inexact ADMM (with Newton-type inner iterations), across a range of discretization sizes:

  • Iteration and runtime efficiency: PAM required $4$–$20$ outer iterations and $0.001$–$0.05$ seconds, while ADMM took $50$–$300$ iterations and $0.02$–$0.2$ seconds, with both methods attaining identical objective values to machine accuracy.
  • Convergence profile: PAM's objective exhibited rapid initial decrease and smooth convergence, contrasting with the slower final-phase descent in ADMM, due to inexact Newton steps.
  • Ground-state recovery: Final minimizers reproduced nonnegative, symmetric, and exponentially decaying ground-state wavefunctions, in agreement with analytical expectations for BEC.
  • Sensitivity to $\alpha$: Modest variation of $\alpha$ from $\|\mathcal{T}_f\|$ to $3\|\mathcal{T}_f\|$ had little impact on final objectives, with runtime varying by at most a factor of two. Setting $\alpha<\|\mathcal{T}_f\|$ generated instability; large $\alpha$ slowed progress.

In sum, the tensor-based PAM method exploits the equivalence between inhomogeneous quartic and multilinear formulations to provide efficient blockwise updates with simple convergence guarantees, demonstrating superior per-iteration efficacy compared to ADMM in the context of synthetic BEC tests (Chen et al., 31 Dec 2025).
