
Quasar-Convexity: Optimizing Nonconvex Functions

Updated 26 November 2025
  • Quasar-convexity is a relaxation of convexity defined by a one-point gradient condition that bridges star-convexity and general nonconvexity for global optimization guarantees.
  • Algorithms leveraging quasar-convexity, including deterministic, stochastic, and proximal methods, achieve accelerated convergence rates similar to those in convex optimization.
  • The framework has practical applications in machine learning, dynamical systems, reinforcement learning, and distributed or minimax settings, extending classical optimization bounds.

Quasar-convexity is a structural relaxation of convexity that characterizes a broad class of nonconvex functions enabling global optimization guarantees typically unattainable in generic nonconvex settings. This concept and its generalizations, such as strong quasar-convexity, proximal-quasar-convexity, and generalized quasar-convexity (GQC), provide frameworks in which first-order and proximal-type algorithms attain accelerated convergence rates comparable to those of convex optimization, with substantial applications in machine learning, dynamical systems, and reinforcement learning. Quasar-convexity interpolates between star-convexity and broader one-point relaxation hierarchies, supporting both deterministic and stochastic algorithmic guarantees and extending naturally to constrained, stochastic, distributed, and minimax settings.

1. Foundational Definitions and Generalizations

The canonical definition for a differentiable function $f : \mathbb{R}^n \rightarrow \mathbb{R}$ with a global minimizer $x^*$ and $\gamma \in (0,1]$ is
$$f(x^*) \geq f(x) + \frac{1}{\gamma} \langle \nabla f(x), x^* - x \rangle \quad \forall x \in \mathbb{R}^n,$$
or, equivalently,

$$f(x) - f(x^*) \leq \frac{1}{\gamma} \langle \nabla f(x), x - x^* \rangle.$$

When $\gamma = 1$ this is star-convexity; for general $\gamma$ it allows nonconvexity, relaxing global affine support to a one-point condition anchored at the minimizer. Strong quasar-convexity adds a quadratic term,
$$f(x^*) \geq f(x) + \frac{1}{\gamma} \langle \nabla f(x), x^* - x \rangle + \frac{\mu}{2} \|x^* - x\|^2,$$
with $\mu > 0$ yielding uniqueness of the minimizer and robust error contraction (Jin, 2020; Hermant et al., 30 May 2024; Brito et al., 4 Sep 2025; Khanh et al., 28 Oct 2025).
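
To make the one-point condition concrete, the short sketch below numerically estimates the largest admissible $\gamma$ on a grid for a simple one-dimensional nonconvex function; the test function, grid, and tolerance are assumptions chosen for illustration and are not taken from the cited works.

```python
import numpy as np

# Illustrative (assumed) test function with global minimizer x* = 0:
#   f(x) = x^2 + 3*sin(x)^2.
# Whether it is quasar-convex, and with which gamma, is what this check probes.
def f(x):
    return x**2 + 3.0 * np.sin(x)**2

def grad_f(x):
    return 2.0 * x + 6.0 * np.sin(x) * np.cos(x)

x_star = 0.0
xs = np.linspace(-10.0, 10.0, 200_001)
xs = xs[f(xs) - f(x_star) > 1e-12]          # avoid dividing by ~0 near the minimizer

# gamma-quasar-convexity requires, for all x:
#   f(x) - f(x*) <= (1/gamma) * <grad f(x), x - x*>,
# so the largest admissible gamma at a point is the ratio below, and a numerical
# estimate for the whole function is the minimum of that ratio over the grid.
ratios = grad_f(xs) * (xs - x_star) / (f(xs) - f(x_star))
gamma_hat = min(ratios.min(), 1.0)          # gamma is capped at 1 by definition

if gamma_hat > 0:
    print(f"numerically quasar-convex on the grid with gamma ~ {gamma_hat:.3f}")
else:
    print("inequality violated on the grid: not quasar-convex for any gamma > 0")
```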

Block-structured settings, as formalized in Generalized Quasar-Convexity (GQC), assign a per-block parameter $\gamma_i$ to each variable block $x_i$, defined over product spaces (e.g., products of probability simplices):
$$f(x^*) - f(x) \geq \sum_{i=1}^d \frac{1}{\gamma_i} \langle F_i(x), x^*_i - x_i \rangle,$$
where $F_i$ is a general internal oracle, often but not necessarily a gradient; a smaller $\gamma_i$ signifies greater nonconvexity in block $i$ (Ding et al., 20 Jul 2024). The extension to minimax settings, Generalized Quasar-Convexity-Concavity (GQCC), uses surrogate operators and weighting.

In constrained or compositional scenarios, proximal-quasar-convexity replaces $\nabla f$ with the proximal-gradient mapping, ensuring the structure persists under constraints (Farzin et al., 4 May 2025; Martínez-Rubio, 2 Oct 2025).
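
As a rough sketch of the proximal-gradient (composite gradient) mapping that takes the place of $\nabla f$ here, the example below computes the standard mapping $G_\lambda(x) = \big(x - \mathrm{prox}_{\lambda g}(x - \lambda \nabla f(x))\big)/\lambda$ for a box constraint, where the prox reduces to a projection; the step size, constraint set, and smooth part are illustrative assumptions, and the exact definitions used in the cited works may differ in detail.

```python
import numpy as np

def project_box(x, lo=-1.0, hi=1.0):
    # prox of the indicator of the box [lo, hi]^n is the Euclidean projection
    return np.clip(x, lo, hi)

def prox_grad_mapping(grad_f, x, lam=0.1, lo=-1.0, hi=1.0):
    """Composite gradient mapping G_lam(x) = (x - prox_{lam*g}(x - lam*grad_f(x))) / lam,
    here with g the indicator of a box, so the prox is a projection."""
    x_plus = project_box(x - lam * grad_f(x), lo, hi)
    return (x - x_plus) / lam

# Proximal-quasar-convexity then asks, roughly, that the one-point inequality
#   f(x) - f(x*) <= (1/gamma) * <G_lam(x), x - x*>
# holds with the gradient mapping G_lam in place of grad f.
grad_f = lambda x: 2.0 * x            # illustrative smooth part: f(x) = ||x||^2
x = np.array([1.5, -0.3])
print(prox_grad_mapping(grad_f, x))   # mapping at a point outside the box
```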

2. Algorithmic Implications and Complexity Results

Quasar-convexity enables first-order and related algorithms to achieve convergence rates much sharper than for general nonconvex functions, frequently matching those in convex optimization up to $\gamma$-dependent factors.

  ‱ Deterministic Gradient Descent and Acceleration: For smooth, $\gamma$-quasar-convex $f$, deterministic accelerated methods achieve
$$f(x_T) - f(x^*) = \widetilde{O}\!\left(\frac{L \|x_0 - x^*\|^2}{\gamma T^2}\right),$$
while strongly quasar-convex functions permit linear rates:
$$f(x_T) - f(x^*) \leq C \left(1 - \gamma \sqrt{\mu / L}\right)^T.$$
These mirror the classic convex and strongly convex rates, with degradation confined to the $1/\gamma$ factor (Jin, 2020; Wang et al., 2023; Hermant et al., 30 May 2024); a minimal gradient-descent sketch is given after this list.

  ‱ Stochastic Optimization: Under quasar-convexity, SGD and variance-reduced methods yield $O(1/\sqrt{T})$ to $O(1/T)$ convergence, and $O(\log(1/\epsilon))$ complexity for the strong variants. Adaptive stochastic mirror-descent frameworks (e.g., QASGD, QASVRG) exploit this structure in finite-sum and online settings (Fu et al., 2023).
  ‱ Zeroth-order (Gaussian smoothing) algorithms: Randomized algorithms using smoothed function value queries (instead of gradients) inherit complexity bounds of $O(n/\epsilon)$ (QC) or $O(n \log(1/\epsilon))$ (SQC), with variance reduction tightening the solution neighborhood (Farzin et al., 4 May 2025).
  ‱ Proximal Point and Constrained Methods: The proximal point algorithm (PPA), applied to quasar-convex functions, attains sublinear $O(1/\epsilon)$ complexity; under strong quasar-convexity it contracts linearly with $O(\log(1/\epsilon))$ complexity (Brito et al., 4 Sep 2025). Accelerated schemes for constrained quasar-convex minimization achieve $\widetilde{O}(1/(\gamma\sqrt{\epsilon}))$ rates, with projected gradient descent and Frank–Wolfe procedures scaling as $1/(\gamma^2\epsilon)$ (Martínez-Rubio, 2 Oct 2025).
  ‱ Optimistic Mirror Descent in GQC/GQCC: For multi-block or multi-distribution setups, OMD achieves the adaptive convergence rate
$$\widetilde{O}\!\left(\Big( \sum_{i=1}^d 1/\gamma_i \Big)\epsilon^{-1} \right),$$
which is strictly faster than standard mirror descent in its dependence on the block dimension $d$ (Ding et al., 20 Jul 2024).
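
For the plain (non-accelerated) deterministic case in the first bullet above, a minimal sketch of fixed-step gradient descent on the same illustrative one-dimensional function is given below; the smoothness estimate $L = 8$, the starting point, and the iteration count are assumptions for illustration, and none of the accelerated or stochastic schemes from the cited works are reproduced here.

```python
import numpy as np

def f(x):
    return x**2 + 3.0 * np.sin(x)**2

def grad_f(x):
    return 2.0 * x + 6.0 * np.sin(x) * np.cos(x)

# Crude (assumed) smoothness bound: |f''(x)| = |2 + 6*cos(2x)| <= 8, so take L = 8.
L = 8.0
x, x_star = 6.0, 0.0                  # start well away from the minimizer at 0
gaps = []
for t in range(200):
    gaps.append(f(x) - f(x_star))     # suboptimality gap f(x_t) - f(x*)
    x = x - (1.0 / L) * grad_f(x)     # plain gradient descent with step 1/L

print(f"gap after 10 steps:  {gaps[10]:.3e}")
print(f"gap after 199 steps: {gaps[-1]:.3e}")
```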

3. Relation to Other Function Classes

Quasar-convexity sits strictly between star-convexity (the one-point condition with $\gamma = 1$), strong star-convexity, and convexity, encapsulating a one-point landscape lower bound rather than a global pairwise affine minorization. The inclusion chain is
$$\text{(strongly convex)} \implies \text{(strongly star-convex)} \implies \text{(strongly quasar-convex)}.$$
It is strictly stronger than the Polyak–Ɓojasiewicz (PL) and weak quasi-convexity inequalities and is distinct from weak convexity (a curvature bound) and tilted convexity (Pun et al., 4 Jul 2024; Khanh et al., 28 Oct 2025; MartĂ­nez-Rubio, 2 Oct 2025).

Star-quasiconvexity (SSQC), a superclass unifying convex, star-convex, quasiconvex, and quasar-convex functions, is characterized geometrically by all sublevel sets being star-shaped with respect to the set of global minimizers; under its strong form, both gradient and proximal point algorithms converge linearly (Khanh et al., 28 Oct 2025).

4. Structural and Geometrical Interpretation

Quasar-convex functions enforce that, along any ray starting from a minimizer, $f$ develops no flat regions or spurious critical points, and the gradient maintains a sufficiently acute angle with the direction toward the minimizers. This rules out pathological local minima while allowing rich nonconvexity away from the minimizer, including regions of negative curvature, oscillation, or complicated level sets (especially in composite constructions of the form $f(\|x\|)\cdot g(x/\|x\|)$) (Hermant et al., 30 May 2024; Brito et al., 4 Sep 2025).

GQC extends this to product spaces and block-wise landscapes, permitting distinct convexity-like parameters per variable block and accommodating general function oracles.

5. Applications and Model Classes

Quasar-convexity and its variants have been identified in several high-impact model classes, with applications spanning machine learning, dynamical systems, reinforcement learning, and distributed and minimax optimization.

6. Extensions: Minimax, Online, and Distributed Optimization

Quasar-convexity generalizes to online and dynamic settings, with regret bounds scaling in path variation and cumulative noise, supporting settings where minimizers drift over time (Pun et al., 4 Jul 2024). In minimax and game-theoretic optimization (GQCC), block-wise surrogate functions and contraction mappings ensure that decentralized variants of OMD deliver nearly optimal Nash-equilibrium finding with explicit iteration-complexity controlled by composite block-wise parameters (Ding et al., 20 Jul 2024). These tools have demonstrated tight last-iterate and average iterate bounds, removing dependency on problem dimension in several scenarios.

7. Open Directions and Limitations

Several open problems remain. The dependence of rates on $1/\gamma$ may not be tight, and lower bounds in stochastic, higher-order, or variance-reduced methods are only partially understood (Jin, 2020). Extensions to infinite-dimensional, non-Euclidean, or higher-order (e.g., Riemannian) settings are active research areas, as is the identification of further problem classes with GQC or SQC structure (Martínez-Rubio, 2 Oct 2025; Farzin et al., 4 May 2025). Certain convexification, acceleration, or regularization tricks employed in convex optimization do not naively transfer to the quasar-convex setting because of the one-point anchoring requirement.


Table: Summary of Algorithmic Complexities under Quasar-Convexity

| Setting | Condition | Algorithm/Class | Iteration/Oracle Complexity |
|---|---|---|---|
| Unconstrained, deterministic | $\gamma$-QC, $L$-smooth | (A)GD | $\widetilde{O}\big(\sqrt{L R^2/(\gamma \epsilon)}\big)$ |
| Unconstrained, strongly QC | $(\gamma,\mu)$-SQC | (A)GD, PPA | $O(\log(1/\epsilon)/\gamma)$ |
| SGD (stochastic) | $\gamma$-QC | QASGD | $O\big(R^2 L/(\gamma\epsilon) + R^2\sigma^2/(\gamma^2\epsilon^2)\big)$ |
| Proximal point algorithm (SQC) | $(\kappa, \gamma)$-SQC | PPA | $O(\log(1/\epsilon))$ |
| OMD (blockwise, GQC) | GQC($\gamma_i$) | OMD | $\widetilde{O}\big((\sum_i 1/\gamma_i)\,\epsilon^{-1}\big)$ |
| Zeroth-order (QC/SQC) | (P)QC/SQC, Lipschitz | ZO-GS | $O(n/\epsilon)$ (QC), $O(n\log(1/\epsilon))$ (SQC) |
| Constrained (QC) | Proximal QC | Acc. Prox-Point | $\widetilde{O}\big(1/(\gamma\sqrt{\epsilon})\big)$ |

All rates are for $\epsilon$-optimality in the sense $f(x) - f(x^*) \leq \epsilon$. Here, $L$ is the smoothness constant, $R$ the diameter, $n$ the ambient (block) dimension, $\sigma^2$ the variance, and $\kappa$ a mixing parameter.


Quasar-convexity and its generalizations unify and extend much of the landscape-friendly structure that enables tractable nonconvex optimization, acting as a bridge between the theory of convex analysis and the real-world solvability of complex learning and control problems. The landscape-structural and algorithmic innovation around these classes is a major driver of progress in scalable first-order nonconvex optimization.
