
Adaptive FISTA: Advanced Optimization Variants

Updated 5 June 2026
  • Adaptive FISTA is a framework of optimization algorithms that dynamically adjusts factors like step-size, momentum, and restart schedules based on local curvature and statistical properties.
  • It employs adaptive momentum, parameter-free backtracking, and restart mechanisms to address oscillations, ill-conditioning, and nonconvex challenges in various applications.
  • These methods improve practical convergence and efficiency in large-scale, non-Euclidean, and data-driven problems while closely approximating FISTA’s optimal worst‐case rates.

Adaptive FISTA encompasses a family of Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) variants in which one or more algorithmic parameters—step-size, momentum, regularization, prox metric, or restart schedule—are dynamically adjusted based on local information or statistical properties, rather than fixed a priori. These adaptations aim to improve empirical convergence, robustness to ill-conditioning, applicability in non-convex or non-Euclidean settings, and sample efficiency, while often retaining (or closely approximating) FISTA’s optimal worst-case theoretical rate.

1. Foundations: FISTA and the Need for Adaptivity

FISTA solves composite minimization problems of the form

$$\min_{x \in \mathbb{R}^n} \; f(x) := \Psi(x) + h(x)$$

where $\Psi$ is convex (possibly nonsmooth) and $h$ is convex and differentiable with $\nabla h$ Lipschitz continuous with constant $L$. The classic form uses fixed parameters: step-size $1/L$, Nesterov-style momentum with the canonical $t$-recursion, and a prescribed number of iterations or tolerance.
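As a point of reference, the fixed-parameter iteration can be sketched in a few lines. This is a minimal, dependency-free Python sketch and is not taken from any of the cited papers; the toy separable objective, the helper names `soft_threshold` and `fista`, and the iteration count are assumptions, chosen so that the exact minimizer is available in closed form.

```python
import math

def soft_threshold(v, tau):
    """Proximal map of tau * |.|: shrink v toward zero by tau."""
    return math.copysign(max(abs(v) - tau, 0.0), v)

def fista(grad_h, prox_psi, x0, L, n_iter=2000):
    """Classic FISTA: fixed step 1/L and the canonical t-recursion."""
    x_prev, y, t = list(x0), list(x0), 1.0
    for _ in range(n_iter):
        g = grad_h(y)
        # Forward (gradient) step on h, then backward (proximal) step on Psi.
        x = prox_psi([yi - gi / L for yi, gi in zip(y, g)], 1.0 / L)
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next  # momentum coefficient, tends to 1
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev, t = x, t_next
    return x_prev

# Toy problem (an assumption for illustration):
#   h(x) = 0.5 * sum_i d_i * (x_i - c_i)^2,   Psi(x) = lam * ||x||_1
d = [1.0, 10.0, 100.0]
c = [3.0, -0.5, 0.001]
lam = 1.0

def grad_h(x):
    return [di * (xi - ci) for di, xi, ci in zip(d, x, c)]

def prox_psi(z, step):
    return [soft_threshold(zi, lam * step) for zi in z]

x_hat = fista(grad_h, prox_psi, [0.0, 0.0, 0.0], L=max(d))
# Closed-form minimizer of the separable problem, for comparison.
x_star = [soft_threshold(ci, lam / di) for ci, di in zip(c, d)]
```

Because the toy objective is separable, `x_star` can be computed coordinate by coordinate, which makes it easy to check how close `x_hat` gets after a given number of iterations.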

Despite its optimal $O(1/k^2)$ convergence rate for convex problems, classic FISTA exhibits several practical limitations:

  • Sensitivity to local curvature and the global Lipschitz constant $L$
  • Oscillatory trajectories and potential lack of sequence convergence
  • Inability to exploit strong convexity or adapt to varying local smoothness
  • Lack of mechanisms for efficiently handling nonconvexity, adaptive discretizations, or data-driven structure

This motivates adaptive FISTA schemes that dynamically tune parameters using observed progress, local surrogate models, or problem-specific side information (Liang et al., 2018, Alamo et al., 2019, Ochs et al., 2017).

2. Adaptive Acceleration and Restart Mechanisms

Adaptive FISTA variants deploy several strategies for parameter and schedule adaptation:

  • Adaptive Momentum and “Lazy Start”:

FISTA-Mod introduces $(p,q,r)$ as free parameters into the Nesterov $t$-update. Smaller parameter values slow the approach of the momentum coefficient to 1, suppressing oscillations and improving convergence in practice. This lazy-start approach can accelerate convergence by an order of magnitude over classical settings for “pathological” problems (Liang et al., 2018).
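The effect of the free parameters on the momentum coefficient can be seen numerically. The sketch below assumes the parameterized recursion $t_k = \big(p + \sqrt{q + r\,t_{k-1}^2}\big)/2$ with momentum $a_k = (t_{k-1} - 1)/t_k$, which reduces to the canonical recursion for $p = q = 1$, $r = 4$. The function name `momentum_sequence` and the "lazy" values $p = 1/20$, $q = 1/2$ are illustrative assumptions, not recommendations from the source.

```python
import math

def momentum_sequence(p, q, r, n_iter):
    """Return a_k = (t_{k-1} - 1) / t_k for t_k = (p + sqrt(q + r * t_{k-1}^2)) / 2."""
    t_prev, coeffs = 1.0, []
    for _ in range(n_iter):
        t = (p + math.sqrt(q + r * t_prev * t_prev)) / 2.0
        coeffs.append((t_prev - 1.0) / t)
        t_prev = t
    return coeffs

# Canonical FISTA recursion: p = q = 1, r = 4.
classic = momentum_sequence(1.0, 1.0, 4.0, 100)
# Illustrative "lazy" choice (assumed values): smaller p and q.
lazy = momentum_sequence(1 / 20, 1 / 2, 4.0, 100)

for k in (1, 10, 100):
    print(k, round(classic[k - 1], 3), round(lazy[k - 1], 3))
```

With $r = 4$, $t_k$ grows by roughly $p/2$ per iteration once it is large, so a smaller $p$ keeps $a_k$ further from 1 for longer. That is the "lazy start" behaviour described above.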

  • Adaptive Restart via Function/Momentum Criteria:

Restarts can suppress detrimental oscillations. The LCR-FISTA (“Linearly Convergent Restart FISTA”) variant introduces a globally linearly convergent restart rule for composite convex problems with quadratic functional growth (QFG): FISTA is run in repeated inner loops, each terminated when a local functional decrease criterion is met, with the loop length adaptively doubled if geometric decrease is not observed. No prior knowledge of problem-dependent quantities such as the QFG parameter is required; the resulting convergence is linear in the outer-loop count (Alamo et al., 2019).
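To illustrate the general idea of restarting, the sketch below uses a simple function-value restart heuristic: whenever an accelerated step increases the objective, the momentum is reset and the step is retaken as a plain proximal-gradient step. This is a swapped-in, generic scheme and is not the LCR-FISTA rule, whose inner-loop termination and doubling logic are not reproduced here. The toy problem and all names are assumptions.

```python
import math

def soft(v, tau):
    """Soft-thresholding: proximal map of tau * |.|."""
    return math.copysign(max(abs(v) - tau, 0.0), v)

# Toy problem (assumed): h(x) = 0.5 * sum d_i (x_i - c_i)^2, Psi(x) = lam * ||x||_1
d, c, lam = [1.0, 10.0, 100.0], [3.0, -0.5, 0.001], 1.0
L = max(d)  # Lipschitz constant of grad h for this diagonal quadratic

def f(x):
    return sum(0.5 * di * (xi - ci) ** 2 + lam * abs(xi)
               for di, xi, ci in zip(d, x, c))

def prox_grad_step(y):
    """One forward-backward step from y with step size 1/L."""
    return [soft(yi - di * (yi - ci) / L, lam / L) for di, yi, ci in zip(d, y, c)]

def fista_function_restart(x0, n_iter=2000):
    x_prev, y, t = list(x0), list(x0), 1.0
    history, restarts = [f(x_prev)], 0
    for _ in range(n_iter):
        x = prox_grad_step(y)
        if f(x) > history[-1]:
            # Objective went up: drop the momentum and retake the step from x_prev.
            x, t, restarts = prox_grad_step(x_prev), 1.0, restarts + 1
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next  # equals 0 right after a restart
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev, t = x, t_next
        history.append(f(x))
    return x_prev, history, restarts

x_hat, history, restarts = fista_function_restart([0.0, 0.0, 0.0])
```

Retaking the step from `x_prev` makes the recorded objective values non-increasing up to rounding, since a proximal-gradient step with step size $1/L$ never increases the objective.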

  • Parameter-Free Backtracking and Online Conditioning Estimation:

Free-FISTA couples adaptive backtracking for the Lipschitz constant, non-monotone step-size increases and decreases, and a restart schedule that computes online estimates of the problem conditioning via functional decreases. When QFG holds, the method achieves an accelerated linear rate in function value, without a priori knowledge of any problem constant (Aujol et al., 2023).

  • Gradient and Subspace Adaptivity:

FISTA variants—such as those with spatially adaptive discretizations for Banach-space LASSO or wavelet/tomographic recovery—adaptively refine the computational basis in which the proximal mapping (or the whole iterate) is computed, ensuring that the number of degrees of freedom increases only as necessary to achieve a prescribed accuracy (Chambolle et al., 2021).

| Adaptive feature | Example method | Key reference |
|---|---|---|
| Adaptive momentum | FISTA-Mod / Lazy-Start | (Liang et al., 2018) |
| Adaptive restart | LCR-FISTA, Free-FISTA | (Alamo et al., 2019, Aujol et al., 2023) |
| Adaptive step-size | Backtracking FISTA | (Aujol et al., 2023) |
| Average curvature | AC-FISTA | (Liang et al., 2021) |
| Adaptive discretization | Banach-space FISTA | (Chambolle et al., 2021) |
| Regional/learned adaptivity | RDFNet (“regional FISTA”) | (Zhou et al., 2023) |

3. Step-Size and Curvature Adaptation

Adaptive FISTA methods often avoid fixed global step-sizes:

  • Backtracking: Step sizes are decreased or increased at each iteration according to how well a local quadratic model fits the objective. Non-monotone backtracking FISTA increases the step when the local quadratic model is accurate and decreases it otherwise (Aujol et al., 2023, Nguyen et al., 2024, Rebegoldi et al., 2021, Calatroni et al., 2021).
  • Average Curvature Tracking: AC-FISTA dispenses with explicit line search, maintaining a moving average of “model” local upper curvatures (computed from observed nonlinearity), which is used for both step-size and model-building (Liang et al., 2021). This yields step-size adaptation without backtracking overhead and rates commensurate with the best known for smooth composite acceleration.
  • Strong Convexity Adaptation: When strong convexity (or QFG) is detected or assumed, the momentum parameter is adjusted adaptively; LCR-FISTA and Free-FISTA both adapt the inner loop or momentum by measuring realized functional contractions, without explicit knowledge of the strong convexity or growth parameter (Alamo et al., 2019, Aujol et al., 2023).
  • Nonconvex Generalization: VAR-FISTA and aFISTA adaptively regularize subproblems based on local negative curvature, switching on the fly between (accelerated) FISTA and more robust nonconvex variants, with rates that interpolate between the accelerated rate of the convex case and the slower rates of the general nonconvex case (Sim, 2020, Ochs et al., 2017).
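The backtracking idea from the list above can be sketched as follows. For simplicity this uses the classical monotone rule, in which the curvature estimate `L_est` only grows until the quadratic upper model holds; the non-monotone variants cited above additionally allow the estimate to shrink, which is omitted here. The toy problem, the starting estimate `L0`, the growth factor `eta`, and the small floating-point slack are assumptions.

```python
import math

def soft(v, tau):
    """Soft-thresholding: proximal map of tau * |.|."""
    return math.copysign(max(abs(v) - tau, 0.0), v)

# Toy problem (assumed): h(x) = 0.5 * sum d_i (x_i - c_i)^2, Psi(x) = lam * ||x||_1
d, c, lam = [1.0, 10.0, 100.0], [3.0, -0.5, 0.001], 1.0

def h(x):
    return sum(0.5 * di * (xi - ci) ** 2 for di, xi, ci in zip(d, x, c))

def grad_h(x):
    return [di * (xi - ci) for di, xi, ci in zip(d, x, c)]

def fista_backtracking(x0, L0=1.0, eta=2.0, n_iter=2000):
    x_prev, y, t, L_est = list(x0), list(x0), 1.0, L0
    for _ in range(n_iter):
        g, hy = grad_h(y), h(y)
        while True:
            x = [soft(yi - gi / L_est, lam / L_est) for yi, gi in zip(y, g)]
            diff = [xi - yi for xi, yi in zip(x, y)]
            # Quadratic upper model of h around y with curvature L_est.
            model = (hy + sum(gi * vi for gi, vi in zip(g, diff))
                     + 0.5 * L_est * sum(vi * vi for vi in diff))
            if h(x) <= model + 1e-12:  # small slack for rounding error
                break
            L_est *= eta  # model violated: shrink the step and retry
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev, t = x, t_next
    return x_prev, L_est

x_hat, L_final = fista_backtracking([0.0, 0.0, 0.0])
```

The global constant is never passed in: the loop discovers a workable curvature estimate from the model test alone, and with `eta = 2` it cannot overshoot the true constant by more than a factor of two.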

4. Adaptivity Beyond the Algorithmic Core: Metrics, Discretization, and Learning

  • Variable Metric and Inexact Proximal Mapping: SAGE-FISTA and its convex variant S-FISTA generalize FISTA to variable-metric proximal steps (using pre-conditioning or split-gradient adaptation), which matters for imaging problems and ill-conditioned systems (Rebegoldi et al., 2021, Calatroni et al., 2021). The metric is chosen dynamically, and both the forward and proximal steps may be computed inexactly, with controlled tolerance decay, while adaptive backtracking governs the step-size.
  • Adaptive Subspace Selection: For problems where minimizers lie not in the native Hilbert space but in a Banach space (e.g., a space of sparse measures), FISTA can be applied on a sequence of adaptively refined subspaces. Under verifiable energy approximation rates, a convergence order governed by a parameter that quantifies solution structure is achieved (Chambolle et al., 2021).
  • Data-Driven and Regionwise Adaptivity: Unfolded deep networks with FISTA-style blocks (FISTA-Net, RDFNet) implement adaptivity via learnable, phase-dependent step-sizes, thresholds, momentum, and even region-dependent transformation domains. For example, RDFNet partitions the feature maps by region, provides learnable regional transforms, and pixelwise adaptive soft-thresholding, dramatically outperforming fixed global FISTA or FISTA-Net on spectral snapshot compressive imaging (Zhou et al., 2023).
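For the variable-metric item in the list above, a generic scaled forward-backward step can be written as follows. This is the standard textbook form, given here as a sketch; the symbols $D_k$ (a symmetric positive definite scaling matrix), $\alpha_k$ (step-size), and $y_k$ (extrapolated point) are notation assumed for illustration and are not taken from the cited papers.

$$x_{k+1} = \operatorname{prox}^{D_k}_{\alpha_k \Psi}\!\big(y_k - \alpha_k D_k^{-1} \nabla h(y_k)\big), \qquad \operatorname{prox}^{D}_{\alpha \Psi}(z) := \arg\min_{x} \; \Psi(x) + \frac{1}{2\alpha}\,\|x - z\|_{D}^{2}, \qquad \|v\|_{D}^{2} := v^{\top} D v$$

With $D_k = I$ and $\alpha_k = 1/L$ this reduces to the classic FISTA step. An inexact variant replaces the exact $\arg\min$ by an approximate minimizer whose error tolerance shrinks along the iterations.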

5. Theory: Guarantees and Complexity of Adaptive FISTA

  • Convergence Rates (Convex/Strongly Convex/QFG):
    • Classic FISTA: $O(1/k^2)$ decay of the function-value gap.
    • Adaptive FISTA with QFG: linear convergence in the number of outer restarts for LCR-FISTA (Alamo et al., 2019); accelerated linear convergence for Free-FISTA under quadratic functional growth (Aujol et al., 2023).
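For reference, quadratic functional growth is commonly stated as the following condition, where $f^*$ is the optimal value, $X^*$ the set of minimizers, and $\mu > 0$ the growth parameter. This is the usual definition from the literature, given here as a sketch; the symbol names are assumed notation.

$$f(x) - f^* \;\ge\; \frac{\mu}{2}\,\operatorname{dist}(x, X^*)^2 \quad \text{for all } x \in \operatorname{dom} f$$

Every $\mu$-strongly convex function satisfies this condition, but the converse fails: QFG permits a non-unique minimizer, which is why it covers problems such as LASSO where strong convexity does not hold in general.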
