Tensor-Based Proximal Alternating Minimization

Updated 7 January 2026

The paper introduces a novel tensor-based proximal alternating minimization method that reformulates inhomogeneous quartic optimization into a four-block multilinear problem, enabling efficient closed-form updates.
It employs block coordinate descent with proximal regularization, ensuring strong convexity of each subproblem and guaranteeing convergence under mild assumptions.
Empirical studies on Bose–Einstein condensate simulations demonstrate rapid convergence and lower computational cost compared to inexact ADMM across various discretizations.

A tensor-based proximal alternating minimization (PAM) algorithm is a numerical optimization method for solving inhomogeneous quartic polynomial optimization problems on the sphere, motivated by applications such as discretized Bose–Einstein condensate (BEC) ground state computations. The method establishes an equivalence between a fourth-degree inhomogeneous polynomial minimization and a four-block multilinear optimization problem (MOP), and exploits tensor representations and block coordinate descent with proximal regularization to obtain closed-form iterative updates. Convergence is established under mild assumptions, and empirical studies report fewer iterations and shorter runtimes than inexact ADMM (Chen et al., 31 Dec 2025).

1. Problem Formulation and Tensor-Multilinear Equivalence

The method addresses minimization of an inhomogeneous quartic polynomial of the form $\min_{x\in\mathbb{R}^n}\; f(x)\;=\;\frac{\theta}{2}\sum_{i=1}^n x_i^4\;+\;x^T B x \quad\text{s.t.}\;\|x\|=1,$ where $\theta > 0$ and $B\in\mathbb{R}^{n\times n}$ is symmetric. Any homogeneous degree-4 polynomial $g(x)$ in $n$ variables can be expressed via a symmetric order-4 tensor $\mathcal{T}$ as $g(x)=\langle \mathcal{T},x\circ x\circ x\circ x\rangle$. In the inhomogeneous case, the variable is lifted to $\tilde x=(1, x^T)^T \in \mathbb{R}^{n+1}$, and the homogenized tensor $\mathcal{T}_f\in\mathbb{S}^{4,n+1}$ encodes the full objective.
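The lifting can be checked numerically on a small instance. The sketch below is one possible construction, not necessarily the one used in the paper: it assumes the homogenization multiplies the quadratic term by the square of the lifted first entry, and the helper names `quartic_objective` and `homogenized_tensor` are hypothetical.

```python
import itertools
import numpy as np

def quartic_objective(theta, B, x):
    """f(x) = (theta/2) * sum_i x_i^4 + x^T B x."""
    return 0.5 * theta * np.sum(x**4) + x @ B @ x

def homogenized_tensor(theta, B):
    """Symmetric order-4 tensor T on R^(n+1) with <T, xt^4> = f(x) for xt = (1, x).

    Assumed homogenization: (theta/2) * sum_i u_i^4 + u_0^2 * u^T B u.
    """
    n = B.shape[0]
    A = np.zeros((n + 1,) * 4)
    for i in range(1, n + 1):
        A[i, i, i, i] = 0.5 * theta        # quartic part
    A[0, 0, 1:, 1:] = B                    # quadratic part, padded with u_0^2
    # Averaging over all 24 axis permutations symmetrizes A
    # without changing the value of <A, u^4>.
    perms = itertools.permutations(range(4))
    return sum(np.transpose(A, p) for p in perms) / 24.0

# Small illustrative instance (random data, not a BEC discretization).
rng = np.random.default_rng(0)
n, theta = 4, 1.5
B = rng.standard_normal((n, n))
B = 0.5 * (B + B.T)
x = rng.standard_normal(n)
x /= np.linalg.norm(x)

T_f = homogenized_tensor(theta, B)
x_lift = np.concatenate(([1.0], x))
tensor_value = np.einsum("ijkl,i,j,k,l->", T_f, x_lift, x_lift, x_lift, x_lift)
print(np.isclose(tensor_value, quartic_objective(theta, B, x)))
```

The dense tensor has $(n+1)^4$ entries, so this construction is only meant for verifying small cases.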

Defining a corresponding four-way multilinear function,

$F(x,y,z,w) = \langle \mathcal{T}_f, \tilde x\circ\tilde y\circ\tilde z\circ\tilde w\rangle,$

with each $\tilde x, \tilde y, \tilde z, \tilde w \in \mathbb{R}^{n+1}$ (first entry fixed at 1, remaining entries taken from the respective variable). A key result is that, under a mild concavity condition on $u\mapsto\mathcal{T}_f u^4$, minimizing the original quartic is equivalent to minimizing $F$ over four blocks constrained to the unit sphere: $\min_{\|x\|=1} f(x) = \min_{\|x\|=\|y\|=\|z\|=\|w\|=1} F(x, y, z, w).$ To guarantee global concavity, a shift $-\alpha\|\tilde x\|^4$ is introduced, with $\alpha \geq \|\mathcal{T}_f\|$. The modified problem is

$f_\alpha(x) = \langle \mathcal{T}_f, \tilde x^4 \rangle - \alpha \|\tilde x\|^4,$

and its multilinear equivalent is

$F_\alpha(x, y, z, w) = \langle \mathcal{T}_f, \tilde x\circ\tilde y\circ\tilde z\circ\tilde w \rangle - \alpha \langle \tilde x, \tilde y \rangle \langle \tilde z, \tilde w \rangle,$

with the minimizers of both formulations coinciding under these conditions (Chen et al., 31 Dec 2025).
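A short worked check makes the role of the shift concrete. It assumes, as the lifting suggests, that $f(x)=\langle \mathcal{T}_f, \tilde x^4\rangle$. On the feasible set the shift is a constant, so it changes the objective value but not the minimizers, and the multilinear function restricted to equal blocks recovers the shifted quartic:

```latex
\begin{aligned}
\|\tilde x\|^2 &= 1 + \|x\|^2 = 2 \quad \text{whenever } \|x\| = 1,\\
f_\alpha(x) &= \langle \mathcal{T}_f, \tilde x^4\rangle - \alpha\|\tilde x\|^4 = f(x) - 4\alpha,\\
F_\alpha(x,x,x,x) &= \langle \mathcal{T}_f, \tilde x\circ\tilde x\circ\tilde x\circ\tilde x\rangle
  - \alpha\langle\tilde x,\tilde x\rangle^2 = f_\alpha(x).
\end{aligned}
```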

2. Multi-Block Structure and Blockwise Minimization

The MOP

$\min_{\|x\|=\|y\|=\|z\|=\|w\|=1} F_\alpha(x, y, z, w)$

exhibits a four-block structure, each block constrained to the unit sphere. With three blocks frozen, $F_\alpha$ is affine in the remaining block, so the subproblem reduces to an affine function plus a quadratic proximal term. This multilinearity is what makes a closed-form update available at each step of block coordinate descent (BCD).

This natural cyclic update scheme $(x \to y \to z \to w)$ is the basis for the overall proximal alternating minimization algorithm.
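The closed-form character of a block update can be seen from a short derivation, sketched here for the $x$ block with a generic proximal weight $\gamma>0$ and previous iterate $x^{(k)}$. The constant $c(y,z,w)$ collects every term that does not depend on $x$, and $I_{[2:n+1]}$ denotes restriction to the last $n$ entries of a vector in $\mathbb{R}^{n+1}$:

```latex
\begin{aligned}
F_\alpha(x,y,z,w) &= c(y,z,w) + \langle g_x, x\rangle,
\qquad g_x = I_{[2:n+1]}\,\langle \mathcal{T}_f, \tilde y\circ\tilde z\circ\tilde w\rangle
  - \alpha\langle\tilde z,\tilde w\rangle\, y,\\
\tfrac{\gamma}{2}\|\tilde x-\tilde x^{(k)}\|^2 &= \tfrac{\gamma}{2}\|x-x^{(k)}\|^2
  = \gamma\bigl(1-\langle x, x^{(k)}\rangle\bigr)
  \quad (\|x\|=\|x^{(k)}\|=1),\\
\arg\min_{\|x\|=1}\ \langle g_x-\gamma x^{(k)}, x\rangle
  &= -\frac{g_x-\gamma x^{(k)}}{\|g_x-\gamma x^{(k)}\|}
  \quad (g_x \neq \gamma x^{(k)}).
\end{aligned}
```

The same argument applies to the other three blocks after exchanging the roles of the variables.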

3. Proximal Alternating Minimization Updates

Defining $\mathcal{G} = \{ \tilde x=(1, x^T)^T : \|x\|=1 \}$, the PAM algorithm performs, for $k=0,1,2,\dots$,

$\begin{cases} \tilde x^{(k+1)} = \displaystyle\arg\min_{\tilde x \in \mathcal{G}} \left\{ F_\alpha(x, y^{(k)}, z^{(k)}, w^{(k)}) + \frac{\gamma_1}{2}\|\tilde x - \tilde x^{(k)}\|^2 \right\}, \\ \tilde y^{(k+1)} = \displaystyle\arg\min_{\tilde y \in \mathcal{G}} \left\{ F_\alpha(x^{(k+1)}, y, z^{(k)}, w^{(k)}) + \frac{\gamma_2}{2}\|\tilde y - \tilde y^{(k)}\|^2 \right\}, \\ \tilde z^{(k+1)} = \displaystyle\arg\min_{\tilde z \in \mathcal{G}} \left\{ F_\alpha(x^{(k+1)}, y^{(k+1)}, z, w^{(k)}) + \frac{\gamma_3}{2}\|\tilde z - \tilde z^{(k)}\|^2 \right\}, \\ \tilde w^{(k+1)} = \displaystyle\arg\min_{\tilde w \in \mathcal{G}} \left\{ F_\alpha(x^{(k+1)}, y^{(k+1)}, z^{(k+1)}, w) + \frac{\gamma_4}{2}\|\tilde w - \tilde w^{(k)}\|^2 \right\}. \end{cases}$

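Each block update minimizes a strongly convex quadratic over the sphere and has a closed-form solution: $x^{(k+1)} = \pm \frac{g_x^{(k)}-\gamma_1 x^{(k)}}{\|g_x^{(k)}-\gamma_1 x^{(k)}\|},$ where

$g_x^{(k)} = (I_{[2:n+1]}) \langle \mathcal{T}_f, \tilde y^{(k)} \circ \tilde z^{(k)} \circ \tilde w^{(k)} \rangle - \alpha \langle \tilde z^{(k)},\tilde w^{(k)} \rangle y^{(k)}.$

Here, $I_{[2:n+1]}$ selects the last $n$ entries of the lifted vector, i.e., those corresponding to $x$. The sign is the one that gives the smaller subproblem value; for the minimization stated above this is the negative sign, because on the sphere the subproblem objective equals $\langle g_x^{(k)}-\gamma_1 x^{(k)}, x\rangle$ up to an additive constant. The regularization parameters $\gamma_i > 0$ ensure the strong convexity of each subproblem.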
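A minimal sketch of one PAM cycle, under stated assumptions, may help make the update order concrete. It uses a dense random symmetric tensor in place of $\mathcal{T}_f$, a single proximal weight `gamma` for all four blocks, the Frobenius norm as a stand-in for $\|\mathcal{T}_f\|$, and the negative sign in the closed form (the minimizing choice for the subproblems as written). The names `lift`, `symmetrize`, `block_update`, and `pam_cycle` are hypothetical.

```python
import itertools
import numpy as np

def lift(v):
    """Map v in R^n to the lifted vector (1, v) in R^(n+1)."""
    return np.concatenate(([1.0], v))

def symmetrize(A):
    """Average an order-4 array over all 24 axis permutations."""
    perms = itertools.permutations(range(4))
    return sum(np.transpose(A, p) for p in perms) / 24.0

def F_alpha(T, alpha, x, y, z, w):
    xt, yt, zt, wt = lift(x), lift(y), lift(z), lift(w)
    multilinear = np.einsum("ijkl,i,j,k,l->", T, xt, yt, zt, wt)
    return multilinear - alpha * (xt @ yt) * (zt @ wt)

def block_update(T, alpha, gamma, cur, partner, o1, o2):
    """Closed-form proximal update of one block.

    `partner` is the block paired with `cur` in the shift term
    <cur~, partner~><o1~, o2~>; T is assumed symmetric, so the order
    of the contracted vectors does not matter.
    """
    pt, at, bt = lift(partner), lift(o1), lift(o2)
    g = np.einsum("ijkl,j,k,l->i", T, pt, at, bt)[1:] - alpha * (at @ bt) * partner
    d = g - gamma * cur
    return -d / np.linalg.norm(d)   # minimizer of <d, v> over the unit sphere

def pam_cycle(T, alpha, gamma, x, y, z, w):
    """One cyclic sweep x -> y -> z -> w, each using the newest blocks."""
    x = block_update(T, alpha, gamma, x, y, z, w)
    y = block_update(T, alpha, gamma, y, x, z, w)
    z = block_update(T, alpha, gamma, z, w, x, y)
    w = block_update(T, alpha, gamma, w, z, x, y)
    return x, y, z, w

# Demo on random data (illustrative only, not a BEC discretization).
rng = np.random.default_rng(0)
n = 5
T = symmetrize(rng.standard_normal((n + 1,) * 4))
alpha = np.linalg.norm(T.ravel())   # Frobenius norm, assumed stand-in for ||T_f||
gamma = 0.5
blocks = [v / np.linalg.norm(v) for v in rng.standard_normal((4, n))]
values = [F_alpha(T, alpha, *blocks)]
for _ in range(20):
    blocks = pam_cycle(T, alpha, gamma, *blocks)
    values.append(F_alpha(T, alpha, *blocks))
print(values[0], values[-1])
```

Because each block update is an exact minimizer of its proximal subproblem, the recorded values should not increase from one cycle to the next.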

4. Convergence Properties

Key convergence assumptions include (i) $\alpha \geq \|\mathcal{T}_f\|$ for concavity, (ii) positive proximal coefficients $\gamma_i$, and (iii) compactness of the feasible set $\mathcal{G}^4$. Theoretical results (Theorems 4.1 and 4.2) establish:

Descent: After each full cycle, the objective $F_\alpha$ decreases by at least a multiple of the squared step norm, so the objective values are monotonically nonincreasing and, being bounded below on the compact feasible set, converge.
Vanishing Steps: The sum of squared step sizes is finite; thus, $\|\tilde t^{(k+1)} - \tilde t^{(k)}\| \to 0$ as $k\to\infty$ (with $\tilde t^{(k)}$ collecting all lifted variables).
Cluster Points: Compactness ensures the existence of accumulation points.
Stationarity: Any cluster point satisfies the KKT conditions for the MOP, i.e., each block’s variational inequality for $F_\alpha$.
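The descent and vanishing-step properties follow from a standard sufficient-decrease argument, sketched below. This is the usual reasoning for proximal alternating schemes and is not necessarily the exact proof given in the paper. Each new block minimizes its proximal subproblem while the previous iterate remains feasible, which gives the first inequality:

```latex
\begin{aligned}
&F_\alpha\bigl(x^{(k+1)}, y^{(k)}, z^{(k)}, w^{(k)}\bigr)
  + \tfrac{\gamma_1}{2}\|\tilde x^{(k+1)}-\tilde x^{(k)}\|^2
  \le F_\alpha\bigl(x^{(k)}, y^{(k)}, z^{(k)}, w^{(k)}\bigr),\\
&\text{and likewise for } y, z, w; \text{ adding the four inequalities gives}\\
&F_\alpha(t^{(k)}) - F_\alpha(t^{(k+1)})
  \ge \tfrac{\gamma_{\min}}{2}\,\|\tilde t^{(k+1)}-\tilde t^{(k)}\|^2,
  \qquad \gamma_{\min} = \min_i \gamma_i,\\
&\sum_{k=0}^{\infty}\|\tilde t^{(k+1)}-\tilde t^{(k)}\|^2
  \le \frac{2}{\gamma_{\min}}\Bigl(F_\alpha(t^{(0)}) - \min_{\mathcal{G}^4} F_\alpha\Bigr) < \infty.
\end{aligned}
```

The minimum over $\mathcal{G}^4$ exists because $F_\alpha$ is continuous and the feasible set is compact.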

5. Computational Complexity Per Iteration

Each block update's primary expense is the tensor contraction $\langle\mathcal{T}_f, \tilde u\circ\tilde v\circ\tilde w\rangle$. For a general dense order-4 tensor, this cost scales as $O(n^4)$. Nevertheless, when $\mathcal{T}_f$ arises from BEC applications, e.g., via discretized Gross–Pitaevskii energies, the tensor exhibits significant sparsity and symmetry; the dominant cost then reduces to $O(n^2)$ or even $O(n)$ per contraction for 1D and 2D finite-difference grids.

Each block update further requires $O(n)$ operations for normalization and vector addition, so, in practice, total cost per full iteration is $O(n^2)$ in 1D and can approach $O(n)$ in 2D with optimal exploitation of tensor structure.
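For the specific objective $f(x)=\frac{\theta}{2}\sum_i x_i^4 + x^T B x$, the contraction can be evaluated without forming the tensor at all. The sketch below is a derivation made here, not a routine taken from the paper: it assumes the homogenization $\frac{\theta}{2}\sum_i u_i^4 + u_0^2\, u^T B u$ and polarizes it with all lifted first entries equal to 1, which leaves one elementwise product and one matrix-vector product with $B$. The names `structured_block_gradient` and `tridiagonal_matvec` are hypothetical, and the tridiagonal $B$ is only an illustrative stand-in for a 1D finite-difference operator.

```python
import numpy as np

def structured_block_gradient(theta, matvec_B, y, z, w):
    """Entries 2..n+1 of <T_f, yt o zt o wt>, computed without building T_f.

    Assumes T_f is the symmetric tensor of (theta/2) sum_i u_i^4 + u_0^2 u^T B u
    with B symmetric, and that yt, zt, wt all have first entry 1.
    """
    return 0.5 * theta * (y * z * w) + matvec_B(y + z + w) / 6.0

def tridiagonal_matvec(diag, off):
    """Return v -> B v for a symmetric tridiagonal B in O(n) operations."""
    def matvec(v):
        out = diag * v
        out[:-1] += off * v[1:]    # superdiagonal contribution
        out[1:] += off * v[:-1]    # subdiagonal contribution
        return out
    return matvec

# Illustrative data: random tridiagonal B and random unit-norm blocks.
rng = np.random.default_rng(0)
n, theta = 1000, 1.0
diag = rng.standard_normal(n)
off = rng.standard_normal(n - 1)
y, z, w = (v / np.linalg.norm(v) for v in rng.standard_normal((3, n)))
g = structured_block_gradient(theta, tridiagonal_matvec(diag, off), y, z, w)
print(g.shape)
```

With this form the per-block cost is governed by the matrix-vector product with $B$, so it depends on the sparsity of the discretized operator.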

6. Hyperparameter Selection and Acceleration

Several algorithmic choices affect performance:

Shift parameter $\alpha$: Chosen to satisfy $\alpha \geq \|\mathcal{T}_f\|$, where $\|\mathcal{T}_f\| \approx \max_{\|u\|=1}|\mathcal{T}_f u^4|$. A slightly larger $\alpha$ improves concavity and numerical stability, but an excessively large $\alpha$ suppresses the multilinear term and slows convergence.
Proximal coefficients $\gamma_i$: Any positive value yields convergence. In practice, a moderate constant (e.g., $0.5$) balances progress and stability.
Initialization: Random unit-norm initialization suffices; marginally improved warm starts may be obtained via a few steps of the power method or MBI.
Acceleration strategies: Proximal coefficients may be increased adaptively when little progress is detected, or SQUAREM-type extrapolation may be applied to the iterates to speed up the early iterations.
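Two of these choices can be sketched in a few lines. Both helpers are hypothetical illustrations, not the paper's rules. The first assumes the required norm is bounded by the Frobenius norm, which holds for the spectral norm by the Cauchy–Schwarz inequality, so it gives a conservative shift. The second raises the proximal coefficient when the last objective decrease falls below a tolerance; the tolerance, growth factor, and cap are arbitrary placeholder values.

```python
import numpy as np

def conservative_shift(T, safety=1.0):
    """Frobenius norm of T times a safety factor >= 1 (assumed upper bound on ||T||)."""
    return safety * np.linalg.norm(T.ravel())

def adapt_gamma(gamma, last_decrease, tol=1e-8, factor=2.0, gamma_max=1e3):
    """Increase gamma when the last cycle made little progress (placeholder rule)."""
    if last_decrease < tol:
        return min(gamma * factor, gamma_max)
    return gamma

# Check |<T, u^4>| <= ||T||_F on random unit vectors (illustrative data).
rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4, 4, 4))
alpha = conservative_shift(T)
samples = rng.standard_normal((200, 4))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)
largest = max(abs(np.einsum("ijkl,i,j,k,l->", T, u, u, u, u)) for u in samples)
print(largest <= alpha)
```

A Frobenius-based shift can be much larger than necessary, which, as noted above, may slow convergence; it trades speed for a guaranteed bound.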

7. Numerical Experiments and Performance Comparison

Empirical validation employed synthetic BEC data, with both 1D and 2D Gross–Pitaevskii energy discretizations. The algorithm was compared to an inexact ADMM (with Newton-type inner iterations), across a range of discretization sizes:

Iteration and runtime efficiency: PAM required $4$–$20$ outer iterations and $0.001$–$0.05$ seconds, while ADMM took $50$–$300$ iterations and $0.02$–$0.2$ seconds, with both methods attaining identical objective values to machine accuracy.
Convergence profile: PAM's objective decreased rapidly at first and then converged smoothly, whereas ADMM showed a slower final-phase descent, attributed to its inexact Newton steps.
Ground-state recovery: Final minimizers reproduced nonnegative, symmetric, and exponentially decaying ground-state wavefunctions, in agreement with analytical expectations for BEC.
Sensitivity to $\alpha$: Modest variation of $\alpha$ from $\|\mathcal{T}_f\|$ to $3\|\mathcal{T}_f\|$ had little impact on final objectives, with runtime varying by at most a factor of two. Setting $\alpha<\|\mathcal{T}_f\|$ generated instability; large $\alpha$ slowed progress.

In sum, the tensor-based PAM method exploits the equivalence between inhomogeneous quartic and multilinear formulations to provide efficient blockwise updates with simple convergence guarantees, and it required fewer iterations and less runtime than inexact ADMM in the synthetic BEC tests (Chen et al., 31 Dec 2025).

References (1)

Chen et al. (2025). Tensor Based Proximal Alternating Minimization Method for A Kind of Inhomogeneous Quartic Optimization Problem.
