On the convergence analysis of the decentralized projected gradient descent method (2303.08412v2)
Abstract: In this work, we are concerned with the decentralized optimization problem \begin{equation*} \min_{x \in \Omega}~f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i (x), \end{equation*} where $\Omega \subset \mathbb{R}^d$ is a convex domain and each $f_i : \Omega \rightarrow \mathbb{R}$ is a local cost function known only to agent $i$. A fundamental algorithm is the decentralized projected gradient method (DPG), given by \begin{equation*} x_i(t+1)=\mathcal{P}_{\Omega}\Big[\sum_{j=1}^{n} w_{ij} x_j(t) -\alpha(t)\nabla f_i(x_i(t))\Big], \end{equation*} where $\mathcal{P}_{\Omega}$ is the projection operator onto $\Omega$ and $\{w_{ij}\}_{1\leq i,j \leq n}$ are the communication weights among the agents. While this method has been widely used in the literature, its convergence property has not been established so far, except for the special case $\Omega = \mathbb{R}^d$. This work establishes new convergence estimates for DPG when the aggregate cost $f$ is strongly convex and each function $f_i$ is smooth. If the stepsize is a constant $\alpha (t) \equiv \alpha > 0$ that is suitably small, we prove that each $x_i (t)$ converges to an $O(\sqrt{\alpha})$-neighborhood of the optimal point. We further improve this result by showing that $x_i (t)$ converges to an $O(\alpha)$-neighborhood of the optimal point if the domain is the half-space $\mathbb{R}^{d-1}\times \mathbb{R}_{+}$, for any dimension $d\in \mathbb{N}$. We also obtain new convergence results for decreasing stepsizes. Numerical experiments are provided to support the convergence results.
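To make the DPG update concrete, the following is a minimal NumPy sketch, not the authors' implementation or their experimental setup. It assumes quadratic local costs $f_i(x) = \tfrac{1}{2}\|x - b_i\|^2$, a ring communication graph with uniform weights $1/3$, the half-space domain $\mathbb{R}^{d-1}\times \mathbb{R}_{+}$, and a constant stepsize. The names `project_half_space`, `ring_weights`, and `dpg`, and the data `b`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def project_half_space(x):
    """Project onto R^{d-1} x R_+ by clamping the last coordinate at 0."""
    y = np.array(x, dtype=float)
    y[..., -1] = np.maximum(y[..., -1], 0.0)
    return y


def ring_weights(n):
    """Symmetric, doubly stochastic weights on a ring (assumed topology):
    each agent averages itself and its two neighbours with weight 1/3."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] += 1.0 / 3.0
        W[i, (i - 1) % n] += 1.0 / 3.0
        W[i, (i + 1) % n] += 1.0 / 3.0
    return W


def dpg(b, W, alpha, steps):
    """Run DPG with constant stepsize alpha.

    Row i of b defines the assumed local cost f_i(x) = 0.5 * ||x - b_i||^2,
    so grad f_i(x_i) = x_i - b_i. Row i of the returned array is x_i(steps).
    """
    x = project_half_space(np.zeros_like(b, dtype=float))
    for _ in range(steps):
        grad = x - b                       # local gradients, one row per agent
        x = project_half_space(W @ x - alpha * grad)
    return x


# Illustrative data (assumption): 5 agents in R^3. The mean of the last
# coordinate is negative, so the constraint is active at the optimum.
b = np.array([
    [1.0, 2.0, 1.0],
    [0.0, -1.0, -2.0],
    [2.0, 0.0, -1.0],
    [-1.0, 1.0, -3.0],
    [3.0, 3.0, 0.5],
])
W = ring_weights(len(b))

# For these quadratic costs, the constrained minimizer of the aggregate cost
# is the projection of the mean of the b_i onto the half-space.
x_star = project_half_space(b.mean(axis=0))

x = dpg(b, W, alpha=0.01, steps=5000)
print("max distance to optimum:", np.linalg.norm(x - x_star, axis=1).max())
```

In this sketch every agent stays feasible at each step, and the iterates settle at a point near `x_star` rather than exactly on it. Rerunning with a larger constant stepsize, for example `alpha=0.1`, leaves a larger residual distance, which illustrates the stepsize-dependent neighborhood described in the abstract.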