Decentralized SCA Momentum-based Prox-Linear (D-SCAMPL)
- The paper introduces D-SCAMPL, which reformulates constrained nonconvex problems using an exact-penalty approach to avoid costly projections.
- It employs successive convex approximation and local prox-linear surrogates with momentum-based variance reduction for efficient decentralized optimization.
- The method achieves optimal sample and communication complexity while ensuring feasibility and consensus across agents in multi-agent networks.
Decentralized SCA Momentum-based Prox-Linear (D-SCAMPL) is a distributed algorithmic framework for consensus-based stochastic optimization in multi-agent networks, targeting nonconvex objectives with convex nonsmooth regularization and complex functional constraints. D-SCAMPL leverages successive convex approximation (SCA), momentum-based variance-reduction, and local prox-linearization to achieve efficient decentralized constrained optimization via only local stochastic gradients and constraint information, without requiring global projections or multiple consensus rounds per iteration (Sharma et al., 28 Jan 2026).
1. Problem Setting and Motivation
D-SCAMPL addresses decentralized stochastic optimization over undirected networks of $n$ agents, where the agents collectively solve
$$\min_{x \in \mathbb{R}^d} \; \frac{1}{n}\sum_{i=1}^n f_i(x) + h(x) \quad \text{subject to} \quad g_j(x) \le 0, \; j = 1, \dots, m,$$
with $f_i$ denoting a possibly nonconvex smooth local objective, $h$ a convex nonsmooth regularizer, and $g_j$ smooth convex (possibly nonlinear) inequality constraints. Each agent $i$ has access only to local data and first-order oracle calls for $f_i$ and the $g_j$. Explicit projection onto the feasible set is assumed intractable; thus, the algorithm must avoid subproblems requiring such projections.
The central challenge is to design an algorithm that (i) handles stochastic gradients and variance, (ii) ensures feasibility with nonlinear constraints, (iii) is communication-efficient (few rounds per iteration), and (iv) achieves the optimal sample/communication complexity scaling for nonconvex decentralized problems.
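To make the problem class concrete, the following sketch builds a toy instance with smooth local least-squares losses $f_i$, an $\ell_1$ regularizer as the nonsmooth $h$, and a norm-ball constraint as the smooth convex $g$. All names, dimensions, and values are illustrative choices, not data from the paper:

```python
import numpy as np

# Hypothetical toy instance of the problem class:
#   min_x (1/n) sum_i f_i(x) + h(x)   s.t.  g(x) <= 0
# with smooth local f_i, convex nonsmooth h, and smooth convex g.
rng = np.random.default_rng(0)
n_agents, dim = 4, 3
A = [rng.standard_normal((5, dim)) for _ in range(n_agents)]
b = [rng.standard_normal(5) for _ in range(n_agents)]

def f_i(i, x):
    """Smooth local loss built from agent i's private data."""
    return 0.5 * np.sum((A[i] @ x - b[i]) ** 2)

def grad_f_i(i, x):
    """First-order oracle for f_i: only agent i can evaluate this."""
    return A[i].T @ (A[i] @ x - b[i])

def h(x):
    """Convex nonsmooth regularizer (l1 norm)."""
    return 0.1 * np.sum(np.abs(x))

def g(x):
    """Smooth convex inequality constraint: ||x||^2 <= 1."""
    return np.dot(x, x) - 1.0

x0 = np.zeros(dim)   # feasible starting point: g(x0) = -1 <= 0
```

Projecting onto $\{x : g(x) \le 0\}$ is easy for this toy ball constraint, but for general nonlinear $g_j$ it is not, which is what motivates the projection-free design below.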
2. Algorithmic Framework
D-SCAMPL follows an SCA-proximal linearization approach. The key mechanisms are:
- Exact-Penalty Reformulation: The constrained problem is equivalently rewritten via the penalty function, introducing a penalty parameter $\lambda > 0$ and a penalty slack such that
$$\min_{x} \; \frac{1}{n}\sum_{i=1}^n f_i(x) + h(x) + \lambda\,\max\{0,\, g_1(x), \dots, g_m(x)\}.$$
This formulation avoids complex projections and enables subgradient control via the max-penalty term.
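A minimal sketch of this max-penalty surrogate, with placeholder `f`, `h`, and constraint list `gs` standing in for the paper's quantities:

```python
import numpy as np

def penalized_objective(x, f, h, gs, lam):
    """Exact-penalty surrogate: f(x) + h(x) + lam * max(0, g_1(x), ..., g_m(x)).

    For lam large enough, minimizers of this unconstrained problem
    coincide with minimizers of the original constrained problem.
    """
    violation = max(0.0, max(g(x) for g in gs))
    return f(x) + h(x) + lam * violation

# Toy check: quadratic objective, l1 regularizer, one linear constraint.
f = lambda x: 0.5 * np.dot(x, x)
h = lambda x: 0.1 * np.sum(np.abs(x))
gs = [lambda x: np.sum(x) - 1.0]     # sum(x) <= 1

x_feas = np.array([0.2, 0.3])        # feasible: penalty term vanishes
x_infeas = np.array([1.0, 1.0])      # infeasible: pays lam * violation
```

At feasible points the penalty term is zero, so the surrogate agrees with the original composite objective; infeasible points are charged proportionally to the worst constraint violation.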
- SCA-Prox-Linear Surrogate: At each iteration $t$ and agent $i$, D-SCAMPL constructs a strongly convex surrogate based on linearization of $f_i$ and $g_j$ around the current iterate $x_i^t$. The resulting subproblem is a quadratic program with linearized constraints and proximal regularization, solvable efficiently.
- Momentum-Based Variance Reduction: The local gradient estimator $v_i^t$ is recursively updated, STORM-style, as
$$v_i^t = \nabla f_i(x_i^t;\, \xi_i^t) + (1-\beta)\left(v_i^{t-1} - \nabla f_i(x_i^{t-1};\, \xi_i^t)\right),$$
enabling variance reduction and momentum effects that accelerate convergence.
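This recursion is a STORM-type estimator; a sketch under the assumption that a fresh sample $\xi_i^t$ is drawn each step and used in both gradient evaluations (all names illustrative):

```python
import numpy as np

def storm_update(v_prev, grad_new, grad_old_same_sample, beta):
    """Momentum-based variance-reduced estimator:
        v_t = grad f(x_t; xi_t) + (1 - beta) * (v_{t-1} - grad f(x_{t-1}; xi_t)).

    Both gradients use the SAME fresh sample xi_t, so the correction term
    cancels sampling noise common to consecutive iterates; beta = 1 recovers
    the plain stochastic gradient.
    """
    return grad_new + (1.0 - beta) * (v_prev - grad_old_same_sample)

# Sanity check on a noiseless quadratic f(x) = 0.5 ||x||^2 (grad f(x) = x):
x_old, x_new = np.array([1.0, 2.0]), np.array([0.5, 1.0])
v_old = x_old                                   # exact gradient at x_old
v_new = storm_update(v_old, x_new, x_old, beta=0.1)
# With exact gradients the estimator reproduces grad f(x_new) exactly.
```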
- Distributed Consensus: Each iteration involves two rounds of weighted averaging (mixing via a symmetric, doubly stochastic matrix $W$), one for the primal variable and one for the gradient-tracking estimator, ensuring network-wide approximate consensus without centralized coordination.
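A single mixing round is just a $W$-weighted average over neighbors. The sketch below uses Metropolis-style ring weights as one illustrative choice of symmetric, doubly stochastic $W$; any $W$ with $W\mathbf{1}=\mathbf{1}$ and $W=W^\top$ works:

```python
import numpy as np

# Symmetric doubly stochastic mixing matrix for a 4-agent ring
# (Metropolis-style weights; illustrative, not from the paper).
W = np.array([
    [0.5,  0.25, 0.0,  0.25],
    [0.25, 0.5,  0.25, 0.0 ],
    [0.0,  0.25, 0.5,  0.25],
    [0.25, 0.0,  0.25, 0.5 ],
])

def mix(W, X):
    """One consensus round: agent i replaces its row X[i] with a
    W-weighted average of its own and its neighbors' values."""
    return W @ X

X = np.array([[4.0], [0.0], [2.0], [-2.0]])   # one scalar state per agent
for _ in range(50):
    X = mix(W, X)
# Doubly stochastic mixing preserves the network average (here 1.0) and
# drives all agents toward it at a rate set by the spectral gap of W.
```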
3. Detailed Per-Iteration Mechanisms
At each iteration $t$ and for each agent $i$:
- SCA Prox-Linear Subproblem:
  - Formulate and solve
$$\hat{x}_i^t = \arg\min_{x} \; \langle v_i^t,\, x - x_i^t \rangle + h(x) + \frac{\rho}{2}\,\|x - x_i^t\|^2$$
  subject to the linearized constraints
$$g_j(x_i^t) + \langle \nabla g_j(x_i^t),\, x - x_i^t \rangle \le 0, \qquad j = 1, \dots, m.$$
  - Update $x_i^{t+1/2} = x_i^t + \gamma\,(\hat{x}_i^t - x_i^t)$ with step size $\gamma \in (0,1]$.
- Consensus Averaging (Primal): $x_i^{t+1} = \sum_{j=1}^n W_{ij}\, x_j^{t+1/2}$.
- Variance-Reduced Gradient Update: refresh $v_i^{t+1}$ via the momentum recursion above.
- Consensus Averaging (Gradient-Tracking): $y_i^{t+1} = \sum_{j=1}^n W_{ij}\left(y_j^t + v_j^{t+1} - v_j^t\right)$.
Communication per iteration is limited to two rounds of weighted averaging—one for the variable and one for the dual/gradient estimator (Sharma et al., 28 Jan 2026).
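Putting the steps together, here is a self-contained sketch on a toy strongly convex instance with a single linear constraint, where the prox-linear subproblem reduces to a closed-form halfspace projection. All constants, the uniform mixing matrix, and the use of exact gradients in place of the momentum-based stochastic estimator are simplifying assumptions for illustration:

```python
import numpy as np

# Toy instance: f_i(x) = 0.5 ||x - a_i||^2, h = 0, g(x) = x1 + x2 - 1 <= 0.
# The constrained minimizer of the average objective is x* = (0.5, 0.5).
a = np.array([[1.0, 1.0], [2.0, 0.0], [0.0, 2.0]])
n, dim = a.shape
W = np.full((n, n), 1.0 / n)     # illustrative doubly stochastic mixing matrix
c = np.ones(dim)                 # constraint gradient: g(x) = c @ x - 1
rho, gamma, T = 1.0, 1.0, 20

def grad(i, x):
    """Exact local gradient (stand-in for the variance-reduced estimator)."""
    return x - a[i]

X = np.zeros((n, dim))                               # local iterates
Y = np.array([grad(i, X[i]) for i in range(n)])      # gradient-tracking vars

for _ in range(T):
    X_half = np.empty_like(X)
    for i in range(n):
        # Prox-linear subproblem: min <y_i, d> + (rho/2)||d||^2
        # s.t. g(x_i) + c @ d <= 0  ->  halfspace projection of -y_i / rho.
        d = -Y[i] / rho
        slack = (c @ X[i] - 1.0) + c @ d
        if slack > 0:
            d -= (slack / (c @ c)) * c
        X_half[i] = X[i] + gamma * d
    G_old = np.array([grad(i, X[i]) for i in range(n)])
    X = W @ X_half                                   # primal consensus round
    G_new = np.array([grad(i, X[i]) for i in range(n)])
    Y = W @ (Y + G_new - G_old)                      # gradient-tracking round

x_bar = X.mean(axis=0)   # all agents reach consensus near x* = (0.5, 0.5)
```

Note the two communication rounds per iteration (the two `W @ ...` products), matching the pattern described above; everything else uses only local information.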
4. Convergence Properties and Complexity
D-SCAMPL achieves a stochastic first-order oracle complexity and communication complexity of $O(\epsilon^{-3})$, matching the optimal rate for unconstrained nonconvex centralized stochastic problems under standard smoothness and regularity assumptions. The main convergence theorem ensures that, after $T = O(\epsilon^{-3})$ iterations (with constants depending on the spectral gap of $W$ and the gradient variance $\sigma^2$), the output is an $\epsilon$-approximate KKT solution:
- Consensus error: $\frac{1}{n}\sum_{i=1}^n \|x_i - \bar{x}\| \le \epsilon$,
- Proximity: each local subproblem need only be solved to $\epsilon$-accuracy (for inexact subproblems),
- Stationarity and feasibility (with high probability): the stationarity residual at the averaged iterate is at most $\epsilon$, and $\max_j\, [g_j(\bar{x})]_+ \le \epsilon$.
No agent needs projections onto nonlinear constraints; only local quadratic programs with linearized constraints are solved per step, and communication is minimized (Sharma et al., 28 Jan 2026).
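These stopping quantities are straightforward to evaluate once the stacked iterates are gathered; a minimal sketch (function name and array shapes are illustrative, not from the paper):

```python
import numpy as np

def epsilon_kkt_metrics(X, gs):
    """Consensus error and constraint violation for stacked iterates X (n x d).

    Returns the mean distance of local iterates from their network average,
    and the worst positive constraint value at that average.
    """
    x_bar = X.mean(axis=0)
    consensus_err = np.mean(np.linalg.norm(X - x_bar, axis=1))
    violation = max(0.0, max(g(x_bar) for g in gs))
    return consensus_err, violation

# Example: three agents already in consensus at a feasible point.
X = np.array([[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]])
gs = [lambda x: np.sum(x) - 1.0]
ce, viol = epsilon_kkt_metrics(X, gs)   # -> (0.0, 0.0)
```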
5. Comparison to Related Methods
D-SCAMPL extends and outperforms prior decentralized stochastic composite optimization techniques:
| Algorithm | Constraints | Consensus | Regularizer | Sample Complexity | Communication |
|---|---|---|---|---|---|
| D-SCAMPL | Nonlinear | Two rounds | Nonsmooth ($h(x)$) | $O(\epsilon^{-3})$ | $O(\epsilon^{-3})$ |
| DEEPSTORM (Mancino-Ball et al., 2022) | None | Two rounds | Nonsmooth ($r(x)$) | $O(\epsilon^{-3})$ | $O(\epsilon^{-3})$ |
| D-PSGD [14] | None | One round | Smooth only | $O(\epsilon^{-4})$ | $O(\epsilon^{-4})$ |
| D-MSSCA | Feasibility | Multi-round | Smooth, SCA | Higher | Higher |
D-SCAMPL is the only approach to date that efficiently handles nonlinear constraints and nonsmooth regularization in the decentralized nonconvex stochastic regime without requiring expensive global projections or numerous consensus steps per iteration. In contrast, methods such as DEEPSTORM (Mancino-Ball et al., 2022) target composite but unconstrained problems, while D-MSSCA and similar SCA-based approaches require more communication or complex constraint-handling.
6. Empirical Results and Practical Considerations
Numerical experiments conducted on energy-optimal ocean trajectory planning—involving four unmanned surface vehicles (USVs) coordinated by networked users and governed by complex stochastic ocean-current models—demonstrate robust performance:
- D-SCAMPL (and its base variant D-SMPL) attain KKT-residual and constraint violation decay rates matching the theoretical complexity.
- Iterations impose only small quadratic subproblems (from linearized constraints), improving per-iteration runtime.
- Only two consensus rounds per iteration are necessary, resulting in 2–5× lower wall-clock times to within a given feasibility/objective threshold compared to existing constrained decentralized baselines.
- No feasibility projections are required during optimization (Sharma et al., 28 Jan 2026).
A plausible implication is that D-SCAMPL provides practical scalability for distributed learning and control in networked systems subject to realistic nonlinear constraints.
7. Summary and Outlook
D-SCAMPL constitutes a significant advance in decentralized constrained nonconvex stochastic optimization. Through exact-penalty reformulation, momentum-variance-reduction, and local prox-linear surrogate minimization—combined with SCA methodology—it achieves optimal complexity rates, low communication load, and broad applicability to real-world decentralized learning under general constraint structures. By operating entirely via local first-order information and gradient tracking, it circumvents the primary communication and projection bottlenecks associated with prior art (Sharma et al., 28 Jan 2026).
The contrast between D-SCAMPL and methods such as DEEPSTORM (Mancino-Ball et al., 2022) suggests that future research may further unify advances in decentralized composite, constrained, and communication-efficient stochastic optimization.