
Dimension‐Dependent Rates in Quantum Optimization

Updated 24 October 2025
  • Classical ERM exhibits linear dependence on the dimension d, a fundamental lower bound on sample complexity.
  • Quantum algorithms employing variance reduction and isotropic noise techniques can improve dimension dependencies, mitigating the curse of dimensionality.
  • Advanced methods like quantum gradient estimation and Hamiltonian descent yield exponential or quadratic speedups over classical approaches in high-dimensional settings.

Dimension-dependent rates for quantum stochastic convex optimization concern how sample complexity and convergence bounds in stochastic convex optimization scale with the underlying dimension $d$ of the parameter space, particularly when using quantum algorithms or quantum-inspired methods. This topic is central in understanding whether—and how—quantum techniques can circumvent or mitigate the curse of dimensionality that plagues classical empirical risk minimization and related uniform convergence arguments.

1. Classical Dimension-dependent Rates and Lower Bounds

In classical stochastic convex optimization (SCO), suppose one minimizes

$$F(x) = \mathbb{E}_{f \sim D}[f(x)]$$

over a convex set $\mathcal{K} \subseteq \mathbb{R}^d$ given access to i.i.d. samples $f^1, f^2, \ldots, f^n$ from $D$. The standard method is empirical risk minimization (ERM), where one minimizes the empirical objective

$$F_S(x) = \frac{1}{n} \sum_{i=1}^n f^i(x).$$

A fundamental question concerns how large $n$ must be for $F_S$ to uniformly approximate $F$ over $\mathcal{K}$:

$$|F_S(x) - F(x)| \leq \epsilon \qquad \forall x \in \mathcal{K}$$

with high probability $1-\delta$.

According to covering number arguments, if the functions $f$ are $L$-Lipschitz over $\mathcal{K}$ (which is contained in a ball of radius $R$ under some $\ell_p$ norm), then the necessary sample size for uniform convergence scales as

$$n = O\left( \frac{d (L R)^2 \log \left( \frac{d L R}{\epsilon \delta} \right)}{\epsilon^2} \right)$$

This means that, up to log and constant factors, sample complexity is linear in $d$ (Feldman, 2016). This dependence is not merely an artifact of the analysis: the authors construct explicit lower bounds (using, for example, functions $g_V(x)$ defined via maximal inner product over a packed set of directions $W$) showing that for $n \leq d/6$, ERM may overfit and incur a constant generalization error gap (e.g., at least $1/4$ in the $\ell_2$ setting).
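
To make the scaling concrete, the short sketch below evaluates the covering-number bound above for a few example parameter choices (the values of $L$, $R$, $\epsilon$, and $\delta$ are arbitrary illustrations, and the hidden constant is dropped):

```python
import math

def erm_sample_bound(d, L, R, eps, delta):
    """Evaluate n ~ d (L R)^2 log(d L R / (eps * delta)) / eps^2,
    i.e., the uniform-convergence bound with the hidden constant dropped."""
    return d * (L * R) ** 2 * math.log(d * L * R / (eps * delta)) / eps ** 2

# Hypothetical parameters: unit Lipschitz constant and radius,
# 1% target accuracy and 1% failure probability.
L, R, eps, delta = 1.0, 1.0, 0.01, 0.01
for d in (10, 1_000, 100_000):
    print(d, f"{erm_sample_bound(d, L, R, eps, delta):.3e}")
# The bound grows linearly in d (up to the log factor) -- the
# "curse of dimensionality" for ERM discussed above.
```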

Generalizing to $\ell_p/\ell_q$ geometry (with $1/p + 1/q = 1$), the sample complexity of ERM similarly remains $\Omega(d)$ (see the formula for $g_{p,V}(x)$).

Regularization and smoothness do not alleviate this bound: even with quadratically smoothed max functions (that are $1$-smooth) or $\ell_1$-regularization, the lower bound persists essentially unchanged (see Theorems 3 and 4 in (Feldman, 2016)).

In contrast, alternative algorithmic approaches—such as online-to-batch conversions, algorithmic stability, or structure-aware methods—can bypass linear dimension dependence. For instance, in the $\ell_2/\ell_2$ setting, algorithms exist with sample complexity $O(1/\epsilon^2)$, with no explicit dependence on $d$; for the $\ell_1/\ell_\infty$ case, rates depending only logarithmically on $d$ are possible. Thus, the curse of dimensionality is inherent only to ERM and uniform-convergence-based guarantees.
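
As a minimal sketch of such an alternative, the following projected stochastic-gradient routine (the toy objective, step size, and all parameter values are illustrative choices, not drawn from the cited works) uses $T \approx (LR/\epsilon)^2$ samples regardless of $d$; by symmetry the minimizer of the toy objective is the origin, so the averaged iterate can be checked directly.

```python
import numpy as np

def project_ball(x, R=1.0):
    """Euclidean projection onto the l2-ball of radius R."""
    nrm = np.linalg.norm(x)
    return x if nrm <= R else x * (R / nrm)

def sgd_avg(d, T, R=1.0, L=1.0, seed=None):
    """Projected SGD with averaging on F(x) = E_z ||x - z||_2, z uniform on
    the unit sphere. Each f(.; z) is 1-Lipschitz, and the standard analysis
    bounds the excess risk by L R / sqrt(T), independent of d."""
    rng = np.random.default_rng(seed)
    eta = R / (L * np.sqrt(T))              # fixed step size
    x = np.zeros(d)
    x[0] = R                                # arbitrary start on the boundary
    x_bar = np.zeros(d)
    for _ in range(T):
        z = rng.standard_normal(d)
        z /= np.linalg.norm(z)              # z uniform on the unit sphere
        g = (x - z) / np.linalg.norm(x - z) # subgradient of ||x - z||
        x = project_ball(x - eta * g, R)
        x_bar += x / T
    return x_bar

def estimate_F(x, n=20_000, seed=0):
    """Monte Carlo estimate of F(x) = E_z ||x - z||_2."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, x.size))
    Z /= np.linalg.norm(Z, axis=1, keepdims=True)
    return np.mean(np.linalg.norm(Z - x, axis=1))

# Same sample budget T = (L R / eps)^2 = 10^4 for both dimensions;
# by symmetry the minimizer is x* = 0 with F(x*) = E||z|| = 1.
for d in (10, 1_000):
    x_bar = sgd_avg(d, T=10_000, seed=1)
    print(d, round(estimate_F(x_bar), 3), "(optimum ~ 1.0)")
```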

2. Quantum Algorithmic Models: Inherited and Improved Dimensional Dependence

Quantum algorithms for stochastic convex optimization (QSCO) must be analyzed for whether they inherit classical lower bounds or potentially offer dimension-independent improvements.

Key result: Quantum algorithms mimicking ERM/uniform convergence cannot improve the sample complexity in the worst case, as the same lower bounds apply (Feldman, 2016).

However, quantum variance reduction, mean estimation, quantum subgradient estimation, and other advanced algorithms can offer improved tradeoffs between dimension $d$ and accuracy $\epsilon$, particularly outside the standard ERM paradigm.

For instance, certain quantum algorithms for SDPs (semidefinite programs) decouple dimension and constraint dependencies:

  • Query complexity $\widetilde{O}(s^2(\sqrt{m}\,\epsilon^{-10} + \sqrt{n}\,\epsilon^{-12}))$ for plain input models;
  • $\widetilde{O}(\sqrt{m} + \mathrm{poly}(r))\cdot \mathrm{poly}(\log m, \log n, B, \epsilon^{-1})$ for quantum input models (Brandão et al., 2017).

General convex optimization admits an algorithm with $\widetilde{O}(n)$ quantum membership/function queries, yielding a quadratic improvement over classical algorithms, but lower bounds of $\widetilde{\Omega}(\sqrt{n})$ persist for oracle-based models (Chakrabarti et al., 2018).

For non-smooth, first-order convex problems, gradient descent is essentially optimal with respect to dimension-independence; quantum algorithms cannot improve upon the $O((GR/\epsilon)^2)$ scaling in the black-box setting (Garg et al., 2020).

3. Explicit Quantum Improvements in Certain Regimes

Recent work has demonstrated provable quantum speedups for stochastic convex optimization that achieve improved dependence on $\epsilon$ at the expense of polynomial dependence on $d$. For example, quantum algorithms utilizing multivariate mean estimation achieve query complexity

$$\min \left\{ d^{5/8} (LR/\epsilon)^{3/2},\; d^{3/2}(LR/\epsilon) \right\}$$

for Lipschitz convex function minimization (Sidford et al., 2023), and similar improvements for finding $\epsilon$-critical points in non-convex problems.
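
For orientation, the back-of-the-envelope comparison below evaluates this bound against the dimension-free classical rate $O((LR/\epsilon)^2)$ discussed earlier; the parameter values are arbitrary and all constants are ignored.

```python
def quantum_rate(d, k):
    """min{ d^(5/8) (LR/eps)^(3/2), d^(3/2) (LR/eps) }, with k = L*R/eps."""
    return min(d ** 0.625 * k ** 1.5, d ** 1.5 * k)

def classical_rate(k):
    """Dimension-free classical rate (L*R/eps)^2."""
    return k ** 2

for d, k in [(10**3, 10**2), (10**3, 10**4), (10**6, 10**4)]:
    q, c = quantum_rate(d, k), classical_rate(k)
    print(f"d={d:>7}, LR/eps={k:>6}: quantum ~{q:.2e}, classical ~{c:.2e}")
# The quantum bound improves on the classical one roughly when LR/eps
# exceeds d^(5/4) (high-accuracy regime); otherwise the dimension-free
# classical rate is smaller.
```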

In the high-dimensional regime, quantum algorithms for robust optimization employing state preparation and multi-sampling achieve quadratic improvements in $d$ over the classical $md$ scaling (with $m$ the number of noise vectors) (Lim et al., 2023).

Isotropic noise models further improve dimension-dependent rates. Algorithms leveraging isotropic stochastic gradient oracles (ISGO) can achieve complexities of $O(R^2 \sigma^2 / \epsilon^2 + d)$, thereby improving upon variance-only models by a factor of $d$. The quantum isotropifier, a quantum algorithm converting variance-bounded error into isotropic error, enables $O(d R \sigma_V / \epsilon)$ quantum query complexity with matching lower bounds (Marsden et al., 23 Oct 2025).

4. Dimension-Dependence in Mirror Descent and Private/Stochastic Algorithms

Dimension-independent rates—up to logarithmic factors—have been achieved for stochastic mirror descent in differentially private convex-concave saddle point problems. For smooth convex objectives over polyhedral domains:

$$\text{Gap} = \sqrt{ \frac{\log d}{n} } + \left( \frac{ \log^{3/2} d }{ n \varepsilon } \right)^{1/3}$$

Under second-order smoothness, further improvements to the privacy term exponent are possible (González et al., 5 Mar 2024). These rates are nearly optimal and apply to broader classes than those typically accessible to Frank-Wolfe-type algorithms. The bias-reduced gradient estimators exploiting Maurey sparsification yield convergence rates with dependence only on $\log d$. While primarily developed for DP-SCO, these techniques are plausible candidates for adaptation to quantum settings where extremely large $d$ is prevalent.
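
To see how weak the dimension dependence is, the following toy calculation evaluates the gap bound above (constants dropped) for a fixed $n$ and privacy parameter $\varepsilon$ while $d$ varies over several orders of magnitude; the specific values are illustrative only.

```python
import math

def dp_gap_bound(d, n, eps_dp):
    """Gap ~ sqrt(log d / n) + (log^(3/2) d / (n * eps_dp))^(1/3),
    with hidden constants dropped."""
    return math.sqrt(math.log(d) / n) + (math.log(d) ** 1.5 / (n * eps_dp)) ** (1 / 3)

n, eps_dp = 10**6, 1.0
for d in (10**3, 10**6, 10**9):
    print(f"d={d:.0e}: gap bound ~ {dp_gap_bound(d, n, eps_dp):.4f}")
# Increasing d by six orders of magnitude moves the bound only through
# log d, in contrast to the linear-in-d ERM bound of Section 1.
```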

Similar improvements are seen in user-level DP-SCO, where optimal excess risk rates require the number of users to scale only logarithmically in $d$, via advanced private mean estimation and outlier removal (Asi et al., 2023).

5. Quantum Gradient Methods with Dimension-Independent Complexity

Recent quantum algorithms based on quantum gradient and subgradient estimation (using phase kickback and superposition protocols) have established dimension-independent query complexities, up to polylogarithmic factors, for zeroth-order convex optimization (using only function evaluation oracles) in both smooth and nonsmooth settings:

  • Smooth convex: $\tilde{O}(L R^2 / \epsilon)$
  • Nonsmooth convex: $\tilde{O}((G R / \epsilon)^2)$

This matches the first-order complexities and demonstrates an exponential separation between quantum and classical zeroth-order algorithms, since classical methods require dimension-dependent averaging (Augustino et al., 21 Mar 2025). These quantum routines also generalize to non-Euclidean domains via mirror descent, dual averaging, and mirror-prox.
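
The classical dimension dependence alluded to here is easy to see in the simplest zeroth-order scheme: a finite-difference gradient estimate costs about $d$ function evaluations per step. The sketch below makes that count explicit for a standard forward-difference estimator (this is the classical baseline, not the quantum phase-kickback protocol, and the test function is an arbitrary choice); the quantum routines cited above replace these $d+1$ calls with a near-constant number of evaluation-oracle queries made in superposition.

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Forward-difference gradient estimate: d + 1 evaluations of f."""
    d = x.size
    fx = f(x)                      # 1 evaluation
    g = np.empty(d)
    for i in range(d):             # d more evaluations, one per coordinate
        e = np.zeros(d)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g

# Example: a smooth convex quadratic; every classical zeroth-order step
# pays d + 1 = 101 function evaluations here.
f = lambda x: 0.5 * np.dot(x, x)
x = np.ones(100)
print(np.allclose(fd_gradient(f, x), x, atol=1e-3))  # grad of 0.5||x||^2 is x
```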

6. Quantum Dynamical Algorithms and Discretization Limits

Quantum Hamiltonian Descent (QHD) realizes convex optimization by simulating Schrödinger dynamics with a Hamiltonian incorporating the objective function. Although continuous-time QHD admits arbitrarily rapid convergence, discretization cost constraints enforce an overall query complexity of

$$\tilde{O}\left( d^{1.5} G^2 R^2 / \epsilon^2 \right)$$

for $d$-dimensional $G$-Lipschitz convex objectives (Chakrabarti et al., 31 Mar 2025). In the noiseless regime, this does not improve upon classical bounds, but the quantum algorithm tolerates noise of order $\tilde{O}(\epsilon^3 / (d^{1.5} G^2 R^2))$. In high dimensions and noisy (or stochastic) settings, this translates into a super-quadratic quantum speedup over all classical algorithms with the same noise robustness.
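
Taking the stochastic-setting bounds collected in the table below at face value (classical $O(d^4 (GR/\epsilon)^6)$ versus quantum $O(d^3 (GR/\epsilon)^5)$, constants ignored), the ratio of the two query complexities is roughly $d \cdot GR/\epsilon$; the small calculation that follows plugs in arbitrary example values for illustration only.

```python
def classical_qhd_setting(d, k):
    """Classical rate d^4 (G R / eps)^6 from the table, with k = G*R/eps."""
    return d ** 4 * k ** 6

def quantum_qhd(d, k):
    """Quantum QHD rate d^3 (G R / eps)^5 from the table."""
    return d ** 3 * k ** 5

for d, k in [(10**2, 10**2), (10**4, 10**3)]:
    ratio = classical_qhd_setting(d, k) / quantum_qhd(d, k)
    print(f"d={d}, GR/eps={k}: speedup factor ~ {ratio:.1e}")
# The ratio of the two bounds is d * (G R / eps), so the advantage grows
# with both the dimension and the target accuracy.
```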

7. Context, Implications, and Outlook

The preponderance of results demonstrates that the curse of dimensionality is inherent to uniform convergence/ERM in classical stochastic convex optimization. Quantum algorithms can bypass this curse—but only if they diverge from the ERM/uniform convergence paradigm, or exploit special structures (e.g., robustness, isotropic noise, or quantum gradient estimation).

In high-dimensional quantum settings (such as quantum state tomography and quantum learning), dimension-independent or polylogarithmic-in-dd rates are possible only when the algorithm leverages advanced sampling, gradient estimation, or structural properties—rather than relying on empirical risk minimization.

These findings underscore the importance of the design and selection of optimization methods for both quantum and classical high-dimensional problems. Quantum stochastic convex optimization is not a panacea: its dimension-dependent rates are subject to the fundamental limits established for classical algorithms unless quantum structure is deliberately exploited. Further research investigates new oracle models, functional families, and algorithmic techniques—such as debiasing, mean estimation, and quantum isotropification—that may allow broader classes of quantum optimization tasks to break the curse of dimensionality.


Table: Dimension Dependence in Quantum Stochastic Convex Optimization

| Method / Setting | Classical Rate | Quantum Rate |
| --- | --- | --- |
| ERM / Uniform convergence | $O\left( \frac{d (L R)^2}{\epsilon^2} \right)$ | No improvement for ERM-based quantum algorithms |
| SDP (quantum input model) | $O(mn)$ or worse | $\tilde{O}(\sqrt{m} + \mathrm{poly}(r))$ (Brandão et al., 2017) |
| $\ell_2/\ell_2$ stochastic opt. | $O(1/\epsilon^2)$ for special algorithms | $O(1/\epsilon^2)$ (structure-dependent) |
| Mirror descent (DP setting) | $O(\sqrt{\log d/n} + (\log^{1.5} d / (n\varepsilon))^{1/3})$ (González et al., 5 Mar 2024) | |
| Isotropic noise (classical) | $O(R^2 \sigma^2/\epsilon^2 + R^2 L^2/\epsilon^2)$ | $O(R^2 \sigma^2/\epsilon^2 + d)$ (Marsden et al., 23 Oct 2025) |
| Quantum gradient methods | $O(d)$ (zeroth order) | $O(1)$ (up to polylog factors) (Augustino et al., 21 Mar 2025) |
| QHD, stochastic setting | $O(d^4 (G R/\epsilon)^6)$ | $O(d^3 (G R / \epsilon)^5)$ (Chakrabarti et al., 31 Mar 2025) |

The dimension-dependent rates for (quantum) stochastic convex optimization are therefore a central topic for designing scalable algorithms in high-dimensional settings, and the results to date delineate both the limits and potential algorithmic strategies for escaping the heavy burden of dimension in such problems.
