Block Coordinate Descent Methods
- Block coordinate descent is an optimization method that partitions variables into blocks and updates one block at a time to efficiently tackle high-dimensional problems.
- It uses various selection schemes, such as cyclic, randomized, and greedy, to strategically choose which block to optimize based on problem structure.
- The method supports flexible update strategies including exact minimization and gradient-based steps, making it effective for both convex and nonconvex settings.
Block coordinate descent (BCD) is a fundamental algorithmic paradigm for high-dimensional optimization problems, wherein variables are partitioned into blocks and optimization proceeds by successively updating one or several blocks while keeping the others fixed. By decomposing large-scale, often structured objectives into tractable subproblems, BCD achieves scalability, memory efficiency, and the ability to exploit problem structure across a wide range of convex and nonconvex settings.
1. Fundamental Principles and Variants
The defining characteristic of BCD is the block-wise update mechanism. For a problem of the form

$$\min_{x \in \mathbb{R}^n} F(x) = F(x_1, x_2, \ldots, x_s),$$

where the variable $x$ is partitioned into $s$ blocks $x_1, \ldots, x_s$, the method iterates by selecting a block $i_k$ at step $k$ and performing a block-specific update, e.g.,

$$x_{i_k}^{k+1} \in \arg\min_{y}\, F\big(x_1^k, \ldots, x_{i_k-1}^k,\, y,\, x_{i_k+1}^k, \ldots, x_s^k\big), \qquad x_j^{k+1} = x_j^k \ \text{ for } j \neq i_k,$$

with various rules to select the block and perform the update (exact minimization, gradient, Newton, or proximal steps).
Block selection schemes include:
- Cyclic: Scan through blocks in a fixed order.
- Randomized: Sample blocks i.i.d. or using Markov chains (Sun et al., 2018).
- Greedy: Select the block expected to yield maximal decrease, e.g., Gauss-Southwell-type rules (Nutini et al., 2017).
- Flexible/Variable: Priority-based or arbitrary deterministic scheduling, provided every block is updated regularly (e.g., at least once every K iterations, as in K-cyclic rules) (Briceño-Arias et al., 30 Oct 2025).
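The selection schemes above can be contrasted in a small sketch. The snippet below (all names, such as `bcd`, are illustrative, not from any particular library) minimizes the least-squares objective $\tfrac{1}{2}\|Ax - b\|^2$ by exact minimization over one block per iteration, with the block chosen by a cyclic, randomized, or greedy (Gauss-Southwell-type) rule:

```python
# Minimal BCD sketch: exact block minimization of 0.5*||Ax - b||^2
# under cyclic, randomized, or greedy block selection.
import numpy as np

def bcd(A, b, blocks, scheme="cyclic", iters=100, seed=0):
    """`blocks` is a list of index lists partitioning the coordinates of x."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    for k in range(iters):
        if scheme == "cyclic":            # fixed scan order
            i = k % len(blocks)
        elif scheme == "randomized":      # i.i.d. uniform sampling
            i = rng.integers(len(blocks))
        elif scheme == "greedy":          # Gauss-Southwell: largest block gradient norm
            grad = A.T @ (A @ x - b)
            i = int(np.argmax([np.linalg.norm(grad[blk]) for blk in blocks]))
        blk = blocks[i]
        # Exact minimization over the chosen block with the others fixed:
        # solve (A_blk^T A_blk) x_blk = A_blk^T (b - A_rest x_rest).
        rest = np.setdiff1d(np.arange(A.shape[1]), blk)
        A_blk = A[:, blk]
        rhs = A_blk.T @ (b - A[:, rest] @ x[rest])
        x[blk] = np.linalg.solve(A_blk.T @ A_blk, rhs)
    return x
```

Note that after an exact block minimization the block's partial gradient vanishes, so the greedy rule never reselects the block it just updated; on a two-block problem it therefore behaves like the cyclic rule.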
Block updates can be:
- Exact minimization on the selected block, with all other blocks held fixed.