Optimal choice of the CHB censoring threshold ε1

Determine a theoretically optimal value of the censoring threshold parameter ε1 used in the Censoring-based Heavy Ball (CHB) skip-transmission condition ‖δ∇m^k‖^2 ≤ ε1‖θ^k − θ^{k−1}‖^2. This threshold governs the communication–iteration trade-off: it tunes convergence speed against the number of communications saved by the CHB method.

Background

The CHB algorithm reduces worker-to-server communications by allowing each worker to skip transmitting its gradient update when the change in its local gradient is small relative to the recent model update. This is enforced via the CHB-skip-transmission condition ‖δ∇m^k‖^2 ≤ ε1‖θ^k − θ^{k−1}‖^2, where ε1 is a positive constant.
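The per-worker decision can be sketched as a small helper. This is an illustrative reading of the condition, not code from the paper; the function and argument names (`should_transmit`, `grad_last_sent`, etc.) are ours.

```python
import numpy as np

def should_transmit(grad_now, grad_last_sent, theta_now, theta_prev, eps1):
    """Per-worker CHB censoring test (illustrative helper, names are ours):
    transmit only when the squared change in the local gradient since the
    last transmission exceeds eps1 times the squared latest model update."""
    delta_grad = grad_now - grad_last_sent        # δ∇m^k
    move = theta_now - theta_prev                 # θ^k − θ^{k−1}
    return float(delta_grad @ delta_grad) > eps1 * float(move @ move)
```

When the condition returns `False`, the worker stays silent and the server keeps reusing the worker's last transmitted gradient.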

The paper explains that smaller ε1 yields less censoring (more communications), while larger ε1 increases censoring (fewer communications) at the cost of more iterations, enabling a communication–iteration trade-off. Identifying a theoretically optimal ε1 would formalize and improve this tuning to maximize efficiency without degrading convergence.
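The trade-off can be made concrete with a toy simulation. The distributed quadratic problem, step sizes, and worker count below are our own assumptions for illustration (not the paper's experimental setup); the update is a standard heavy ball step θ^{k+1} = θ^k + β(θ^k − θ^{k−1}) − α·g, where g averages the most recently transmitted (possibly stale) worker gradients.

```python
import numpy as np

def chb_run(eps1, iters=300, alpha=0.05, beta=0.5, seed=0):
    """Sketch of CHB on a toy distributed quadratic (assumed setup): each of
    n_workers holds f_i(θ) = 0.5 θᵀA_iθ − b_iᵀθ and skips transmitting when
    ||∇f_i(θ^k) − last sent gradient||^2 <= eps1 ||θ^k − θ^{k−1}||^2.
    Returns (total communications, initial and final average-gradient norms)."""
    rng = np.random.default_rng(seed)
    dim, n_workers = 5, 4
    As = [np.diag(rng.uniform(0.5, 2.0, dim)) for _ in range(n_workers)]
    bs = [rng.normal(size=dim) for _ in range(n_workers)]
    local_grad = lambda i, th: As[i] @ th - bs[i]

    theta = np.zeros(dim)
    theta_prev = theta.copy()
    last_sent = [local_grad(i, theta) for i in range(n_workers)]  # round 0: all transmit
    comms = n_workers
    err_start = np.linalg.norm(sum(last_sent) / n_workers)

    for _ in range(iters):
        move = theta - theta_prev
        thr = eps1 * (move @ move)
        for i in range(n_workers):
            g = local_grad(i, theta)
            dg = g - last_sent[i]
            if dg @ dg > thr:            # censoring condition violated -> transmit
                last_sent[i] = g
                comms += 1
        g_server = sum(last_sent) / n_workers    # aggregates possibly stale gradients
        theta, theta_prev = theta + beta * move - alpha * g_server, theta

    err_end = np.linalg.norm(sum(local_grad(i, theta) for i in range(n_workers)) / n_workers)
    return comms, err_start, err_end
```

Sweeping ε1 over, say, {0, 1, 4, 16} and comparing `comms` against `err_end` at a fixed iteration budget reproduces the qualitative trade-off: larger ε1 cuts communications while the residual error after the same number of iterations grows.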

Finding a theoretically optimal value of ε1 is an interesting open problem.

References

Communication-Efficient Federated Learning Using Censored Heavy Ball Descent (arXiv:2209.11944, Chen et al., 2022), in Section 2 (Censoring-based heavy ball method), immediately after the CHB-skip-transmission condition