Robust Stochastically-Descending Unrolled Networks (2312.15788v2)

Published 25 Dec 2023 in cs.LG and eess.SP

Abstract: Deep unrolling, or unfolding, is an emerging learning-to-optimize method that unrolls a truncated iterative algorithm in the layers of a trainable neural network. However, the convergence guarantees and generalizability of unrolled networks are still open theoretical problems. To tackle these problems, we provide deep unrolled architectures with a stochastic descent nature by imposing descending constraints during training. The descending constraints are enforced layer by layer to ensure that each unrolled layer takes, on average, a descent step toward the optimum during training. We theoretically prove that the sequence constructed by the outputs of the unrolled layers is then guaranteed to converge for unseen problems, assuming no distribution shift between training and test problems. We also show that standard unrolling is brittle to perturbations, and that our imposed constraints provide the unrolled networks with robustness to additive noise and perturbations. We numerically assess unrolled architectures trained under the proposed constraints in two applications: sparse coding using the learnable iterative shrinkage and thresholding algorithm (LISTA) and image inpainting using the proximal generative flow (GLOW-Prox), and demonstrate the performance and robustness benefits of the proposed method.
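
To make the core idea concrete, below is a minimal PyTorch-style sketch of an unrolled LISTA network whose training loss penalizes any layer that fails, on average, to decrease the sparse-coding objective. All names, dimensions, and the use of hinge penalties in place of the paper's constrained-learning formulation are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

def lasso_objective(x, y, A, lam):
    """Sparse-coding objective f(x) = 0.5 * ||A x - y||_2^2 + lam * ||x||_1."""
    residual = y - x @ A.T
    return 0.5 * residual.pow(2).sum(dim=1) + lam * x.abs().sum(dim=1)

class LISTALayer(nn.Module):
    """One unrolled ISTA iteration with learnable weights and threshold."""
    def __init__(self, m, n):
        super().__init__()
        self.W1 = nn.Linear(m, n, bias=False)          # maps the measurements y
        self.W2 = nn.Linear(n, n, bias=False)          # maps the previous iterate x
        self.theta = nn.Parameter(torch.tensor(0.1))   # soft-threshold level

    def forward(self, x, y):
        z = self.W1(y) + self.W2(x)
        # Soft thresholding with a learnable threshold.
        return torch.sign(z) * torch.clamp(z.abs() - self.theta, min=0.0)

class UnrolledLISTA(nn.Module):
    """Stack of unrolled layers; returns all intermediate iterates."""
    def __init__(self, m, n, num_layers=10):
        super().__init__()
        self.n = n
        self.layers = nn.ModuleList([LISTALayer(m, n) for _ in range(num_layers)])

    def forward(self, y):
        x = torch.zeros(y.shape[0], self.n, device=y.device)
        iterates = [x]
        for layer in self.layers:
            x = layer(x, y)
            iterates.append(x)
        return iterates

def descending_training_loss(iterates, y, A, lam, eps=1e-3, mu=1.0):
    """Final objective plus hinge penalties that push each layer to descend.

    The penalty relu(f(x_{k+1}) - f(x_k) + eps) vanishes only when layer k+1
    decreases the objective by at least eps on average over the batch. This
    mimics the layer-wise descending constraints described in the abstract,
    which the paper handles with constrained learning rather than fixed penalties.
    """
    objectives = [lasso_objective(x, y, A, lam).mean() for x in iterates]
    loss = objectives[-1]
    for f_prev, f_next in zip(objectives[:-1], objectives[1:]):
        loss = loss + mu * torch.relu(f_next - f_prev + eps)
    return loss

# Illustrative usage on random data (dimensions are arbitrary):
A = torch.randn(20, 50)
y = torch.randn(8, 20)
model = UnrolledLISTA(m=20, n=50, num_layers=10)
loss = descending_training_loss(model(y), y, A, lam=0.1)
loss.backward()
```

A training loop would minimize this loss over problems drawn from the same distribution as the test problems, consistent with the no-distribution-shift assumption stated in the abstract.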
