Training guarantees for algorithm-unrolled networks

Develop optimization and learning algorithms with theoretical assurances for computing the parameters of algorithm-unrolled networks by solving the bi-level optimization defined by the lower-level iterative scheme x^{(i)} = T(y, x^{(i-1)}; θ^{(i)}) (Equation \eqref{eq:model-unroll}) and the upper-level empirical loss minimization L(θ) = \sum_{j=1}^{N} c_j · ℓ(G(y_j; θ), x_j^*) (Equation \eqref{eq:unrolling-train-loss}). Specifically, prove optimality guarantees for the trained parameters and characterize the conditions under which training converges to optimal solutions of the upper-level problem.
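
Written out as a single program, the two equations combine into the following bi-level form (a standard reading; the identification of G(y_j; θ) with the final iterate x_j^{(K)} after K unrolled steps is an assumption made here for concreteness):

\begin{aligned}
\min_{\theta = (\theta^{(1)}, \ldots, \theta^{(K)})} \quad & L(\theta) = \sum_{j=1}^{N} c_j \, \ell\big(G(y_j; \theta),\, x_j^*\big) \\
\text{s.t.} \quad & x_j^{(i)} = T\big(y_j,\, x_j^{(i-1)};\, \theta^{(i)}\big), \quad i = 1, \ldots, K, \\
& G(y_j; \theta) = x_j^{(K)}.
\end{aligned}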

Background

Algorithm unrolling treats iterative optimization algorithms as deep neural networks and trains their parameters to achieve acceleration on specific problem distributions. The training objective is typically the empirical loss over a dataset (Equation \eqref{eq:unrolling-train-loss}), while the network’s forward dynamics are defined by the unrolled iterative updates (Equation \eqref{eq:model-unroll}), forming a bi-level optimization problem.
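
As a concrete illustration of this bi-level structure, the following minimal PyTorch sketch unrolls gradient descent on a least-squares lower-level objective, with one learnable step size θ^{(i)} per iteration, and trains those parameters by first-order descent on the empirical upper-level loss. The quadratic objective, the gradient-step choice of T, and all names below are illustrative assumptions, not the tutorial's construction.

import torch
import torch.nn as nn

class UnrolledGD(nn.Module):
    """Unrolls K gradient-descent steps on f(x) = 0.5 * ||A x - y||^2.

    Each iteration applies T(y, x; theta_i): one gradient step whose
    step size theta_i is learnable, so the forward pass realizes
    x^{(i)} = T(y, x^{(i-1)}; theta^{(i)}) and returns G(y; theta) = x^{(K)}.
    """
    def __init__(self, A, num_iters=10):
        super().__init__()
        self.A = A  # fixed (m x n) problem matrix, shared across instances
        # One learnable step size per unrolled iteration: theta^{(i)}.
        self.steps = nn.Parameter(0.1 * torch.ones(num_iters))

    def forward(self, y):
        # x^{(0)} = 0; y has shape (batch, m), x has shape (batch, n).
        x = torch.zeros(y.shape[0], self.A.shape[1])
        for step in self.steps:
            grad = (x @ self.A.T - y) @ self.A  # gradient of 0.5 * ||Ax - y||^2
            x = x - step * grad                 # one unrolled iteration
        return x

torch.manual_seed(0)
m, n, N = 20, 10, 256
A = torch.randn(m, n) / m ** 0.5       # scaled to keep the spectrum tame
x_true = torch.randn(N, n)             # ground-truth solutions x_j^*
y = x_true @ A.T                       # observations y_j
model = UnrolledGD(A, num_iters=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    # Upper-level loss L(theta) with c_j = 1/N and squared-error l.
    loss = ((model(y) - x_true) ** 2).sum(dim=1).mean()
    loss.backward()
    opt.step()

The loop is exactly the heuristic practice the question concerns: it decreases L(θ) empirically, but nothing in it certifies that the learned step sizes converge to an optimal solution of the upper-level problem.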

While existence results show that properly chosen parameters can yield superior convergence rates, it remains unresolved whether training algorithms can provably compute such parameters with optimality guarantees, i.e., with convergence of the training procedure and correctness of the learned solution. The authors highlight the absence of complete theoretical guarantees for this bi-level training in the context of unrolled networks.

References

Although complete results with optimality guarantees are still lacking, some studies identify theoretical properties of the gradient of the loss function.

Learning to optimize: A tutorial for continuous and mixed-integer optimization (2405.15251 - Chen et al., 24 May 2024) in Training paragraph, Section 3.5 (Mathematics behind algorithm unrolling)