Random Submanifold Descent

Updated 4 December 2025

Random Submanifold Descent is an optimization framework that restricts high-dimensional problems to random low-dimensional subspaces, improving efficiency and avoiding local minima.
It employs techniques such as affine submanifold sampling, tangent space projections, and block updates to adapt to various constraints and geometries.
The method generalizes coordinate descent to Euclidean and manifold domains, offering scalable performance in complex tasks like dictionary learning and nonconvex optimization.

Random Submanifold Descent refers to a class of optimization algorithms that operate by sequentially restricting a high-dimensional optimization problem to random low-dimensional subspaces or submanifolds, solving the reduced subproblem, and updating the iterate accordingly. This framework generalizes randomized coordinate descent and block descent approaches to broader settings, including general affine subspaces, manifold domains, and nonlinear constraints. Such methods are especially relevant for non-convex, constrained, or high-dimensional problems where standard descent methods are computationally infeasible or likely to be trapped in local minima.

1. Algorithmic Frameworks

The central principle is to restrict the search for minimizers to random submanifolds or subspaces of low dimension at each iteration. For a continuous function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ and constraint set $\mathcal{X}\subset\mathbb{R}^n$ , the method operates as follows: at iteration $k$ , a random affine submanifold $S_k\subset\mathbb{R}^n$ of dimension $d \ll n$ is chosen, $f$ is minimized over $\mathcal{X}\cap S_k$ using a local solver, and the best solution found is adopted if it improves the objective. Formally, the update step is $x^*_{k+1}\in\arg\min_{x \in \mathcal{X}\cap S_k} f(x)$ , with $x^*_k$ the current best point. Multiple strategies exist for sampling the submanifolds: fully random affine spaces, cone-restricted subspaces aligned with the gradient or secant direction, and tangent subspaces when optimizing over manifolds (Pasechnyuk et al., 2023, Gutman et al., 2019).

For problems on manifold domains, the tangent subspace descent (TSD) generalizes randomized coordinate descent by projecting the Riemannian gradient onto random subspaces of the tangent space $T_x M$ and computing partial updates via the exponential map: $x^{t+1} = \exp_{x^t}(-\eta P_t\nabla f(x^t))$ , where $P_t$ is the random projector at iteration $t$ (Gutman et al., 2019). Block submanifold approaches and RSSM update restricted blocks or submanifold coordinates, often using polar or QR retractions (Cheung et al., 3 Sep 2024, Han et al., 18 May 2025).

2. Submanifold Sampling and Construction

The construction of random submanifolds is pivotal. In affine-space domains, index sets $B\subset\{1,\ldots,n\}$ (of size $d$ ) are randomly selected, and random matrices $A_k\in\mathbb{R}^{n\times d}$ are formed to parameterize the subspace. For cone-restricted sampling, the principal direction is aligned in a cone around a dominant vector (either the gradient or the secant between previous best points), with each subspace constrained to lie within a prescribed angular interval (Pasechnyuk et al., 2023).

Within manifold optimization settings, sampling often uses Haar-uniform measure on the orthogonal group $O(n)$ , or permutation matrices to select random submanifolds of fixed dimension embedded in the Stiefel or quotient manifolds (Han et al., 18 May 2025). In randomized tangent subspace descent (RTSD), projectors are drawn to select intra-frame or out-of-frame tangent directions, subject to symmetry and norm-retention conditions ensuring sufficient coverage and convergence (Gutman et al., 2019). Block submanifold updates, as in RSSM, partition coordinates or columns and perform updates on the union of randomly selected blocks, enabling further reduction in per-iteration cost (Cheung et al., 3 Sep 2024).

3. Theoretical Guarantees and Complexity

Theoretical analyses employ smoothness and convexity conditions tailored to the nature of the function and the domain:

Strongly convex quadratic objectives: Random submanifold descent behaves similarly to gradient descent in the early phase with rate $O((1-1/\kappa)^t)$ , where $\kappa$ is the condition number (Pasechnyuk et al., 2023).
For smooth (geodesically convex) manifold domains, randomized subspace selection rules satisfying the $C$ -randomized norm condition yield $O(1/t)$ convergence for expected suboptimality, and $O(1/t)$ stationarity gap decay under $L_f$ -smoothness (Gutman et al., 2019).
Nonsmooth, weakly convex functions over Stiefel manifolds: RSSM attains an iteration complexity $O(\varepsilon^{-4})$ for reaching an $\varepsilon$ -stationarity measure in both expectation and with high probability. This exploits a Riemannian subgradient inequality (mixing weak convexity and Lipschitz bounds) and an adaptive Mahalanobis metric induced by averaging block projections (Cheung et al., 3 Sep 2024).
General Riemannian submanifold descent, for functions satisfying Polyak-Lojasiewicz (PL) conditions, achieves linear convergence after entering a basin around a minimizer. The contraction factor is proportional to the ratio $r(r-1)/n(n-1)$ , where $r$ is the submanifold dimension (Han et al., 18 May 2025).

The per-iteration complexity is $O(n\,d)$ for affine-space sampling, $O(n\,|B|^2)$ or $O(n p^2/\ell)$ for block-matrix manifold updates, and $O(n r^2)$ for permutation-based submanifold sampling on orthogonal manifolds.

4. Design, Hyperparameters, and Practical Considerations

Hyperparameter selection influences search granularity, computational workload, and convergence behavior:

Submanifold dimension ( $d$ or $r$ ): Larger values increase exploration but raise local solve cost; empirical studies suggest $d/n \approx 0.1-0.3$ balances these demands (Pasechnyuk et al., 2023, Han et al., 18 May 2025).
Number of subspace solves ( $N$ ): Sets total budget.
Number of blocks ( $\ell$ ) in block partitioning methods: Smaller $\ell$ yields larger eligible update blocks per iteration in RSSM and TSD (Cheung et al., 3 Sep 2024, Gutman et al., 2019).
Number of probes ( $p$ ): For the Solar method, maintaining a heap of $p$ best points improves escape from local minima.
Step size ( $\eta$ ): Typically chosen via local smoothness, from $1/L_f$ for RTSD to adaptive or diminishing schedules in stochastic settings.

Random submanifold descent methods maintain low per-iteration complexity and can be efficiently parallelized over probes or blocks. The methods can be configured for refinement (small submanifolds, local search) or large jumps (random orientations, larger submanifolds) to escape poor local minima.

5. Generalizations: Manifolds and Nonlinear Constraints

Random submanifold descent extends naturally to problems with nonlinear equality/inequality constraints and non-Euclidean feasible sets.

For nonlinear polynomial constraint systems, triangularization is used to obtain a triangular representation of the constraint manifold, followed by embedding into reduced dimensions via Whitney's embedding theorem. The descent proceeds by sampling random directions in the tangent space of the embedded manifold and employing numerical continuation for projection (“retraction”) onto the feasible set. Under reasonable regularity and Lipschitz conditions, probabilistic descent guarantees almost sure convergence to critical points; the complexity scales as $O(\epsilon^{-2})$ in the unconstrained case (Dreisigmeyer, 2018).

In quotient manifold settings (e.g., Grassmann, flag, Stiefel), random submanifold selection leverages the group action of $O(n)$ : updates are implemented as random restriction/retraction steps on embedded submanifolds, and the analysis carries over due to isometry properties of the action (Han et al., 18 May 2025). The expected decrease and stationarity guarantees are preserved under such generalizations.

6. Empirical Benchmarks and Performance

Random submanifold descent algorithms have been empirically validated on a diverse array of problems:

Convex quadratic minimization (with high condition numbers): Solar outperforms momentum three-point methods and accelerated zeroth-order baselines, with nearly steepest-descent performance under small subspace dimensions (Pasechnyuk et al., 2023).
Unimodal non-convex problems (e.g., Rosenbrock–Skokov): Solar cone-restricted variant matches restarted conjugate gradient, and surpasses unrestarted variants (Pasechnyuk et al., 2023).
Multimodal global optimization (Rastrigin, DeVilliers–Glasser): Solar finds deeper minima than simulated annealing or monotonic basin-hopping under fixed budgets, with reduced sensitivity to hyperparameter choice (Pasechnyuk et al., 2023).
Orthogonal Procrustes and Stiefel-constrained dictionary learning: Block submanifold methods (RTSD, RSSM, RSDM) match or outperform full Riemannian gradient descent and infeasible methods when the problem dimension is large (Gutman et al., 2019, Cheung et al., 3 Sep 2024, Han et al., 18 May 2025).
Training orthogonal feed-forward networks, principal component analysis, and quadratic assignment: RSDM attains faster convergence and reduced wall-clock time for large $p$ , robustly outperforming coordinate descent and retraction-free approaches (Han et al., 18 May 2025).

7. Limitations, Open Questions, and Future Directions

While random submanifold descent offers flexibility and efficiency, several limitations and open problems remain:

Global convergence proofs for heuristic variants (e.g., Solar, unconstrained random search) are lacking, with performance reliant on empirical coverage of subspaces.
Subproblem-solver inexactness may accumulate, necessitating careful budget and tolerance selection (Pasechnyuk et al., 2023).
Efficient computation of projection/retraction operations on general manifolds can be challenging, especially for large-scale problems.
The scaling operator and adaptive metrics in block-manifold methods introduce additional design complexity, and their dynamic properties merit further investigation (Cheung et al., 3 Sep 2024).
Derivation of optimal sampling strategies and principled choices for submanifold/block partitions, guided by problem structure, remains a topic of ongoing research.
Extension of convergence results to nonlinear constraints, arbitrary manifold geometries, and broader classes of nonsmooth objectives is in progress.

Random submanifold descent, encompassing Solar, TSD, RSSM, and RSDM, provides a rigorous and extensible framework for scalable optimization in high dimensions, manifold-constrained settings, and difficult global optimization tasks. Its systematic construction, analytic guarantees, and adaptability continue to expand the design space for randomized optimization algorithms (Pasechnyuk et al., 2023, Gutman et al., 2019, Cheung et al., 3 Sep 2024, Han et al., 18 May 2025, Qu et al., 2014, Dreisigmeyer, 2018).