A First-Order Multi-Gradient Algorithm for Multi-Objective Bi-Level Optimization (arXiv:2401.09257v2)

Published 17 Jan 2024 in cs.LG

Abstract: In this paper, we study the Multi-Objective Bi-Level Optimization (MOBLO) problem, where the upper-level subproblem is a multi-objective optimization problem and the lower-level subproblem is a scalar optimization problem. Existing gradient-based MOBLO algorithms need to compute the Hessian matrix, which makes them computationally inefficient. To address this, we propose an efficient first-order multi-gradient method for MOBLO, called FORUM. Specifically, we reformulate the MOBLO problem as a constrained multi-objective optimization (MOO) problem via the value-function approach. We then propose a novel multi-gradient aggregation method to solve the resulting constrained MOO problem. Theoretically, we provide a complexity analysis showing the efficiency of the proposed method, together with a non-asymptotic convergence result. Empirically, extensive experiments demonstrate the effectiveness and efficiency of the proposed FORUM method on different learning problems. In particular, it achieves state-of-the-art performance on three multi-task learning benchmark datasets. The code is available at https://github.com/Baijiong-Lin/FORUM.
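To make the value-function reformulation concrete, the sketch below sets up a toy two-objective MOBLO instance and runs a purely first-order joint update on (x, y): the two upper-level gradients are combined with the closed-form min-norm (MGDA-style) rule for two objectives due to Désidéri (2012), and the value-function constraint g(x, y) - min_y' g(x, y') <= 0 is handled with a simple penalty term. This is a minimal illustration under assumed toy objectives F1, F2, g and an assumed penalty weight lam; it is not FORUM's actual aggregation rule, which is specified in the paper.

```python
import numpy as np

# Toy MOBLO after the value-function reformulation:
#   min_{x,y} (F1(x,y), F2(x,y))  s.t.  g(x,y) - v(x) <= 0,  v(x) = min_y' g(x,y').
# All objectives below are illustrative assumptions, not taken from the paper.

def F1(x, y): return np.sum((x - 1.0) ** 2) + np.sum(y ** 2)
def F2(x, y): return np.sum((x + 1.0) ** 2) + np.sum((y - 1.0) ** 2)
def g(x, y):  return np.sum((y - x) ** 2)  # lower level: y*(x) = x, so v(x) = 0

def grads(x, y):
    """Analytic gradients w.r.t. the joint variable (x, y)."""
    g1 = np.concatenate([2 * (x - 1.0), 2 * y])            # grad F1
    g2 = np.concatenate([2 * (x + 1.0), 2 * (y - 1.0)])    # grad F2
    gc = np.concatenate([-2 * (y - x), 2 * (y - x)])       # grad of g(x,y) - v(x)
    return g1, g2, gc

def mgda_two(g1, g2):
    """Min-norm convex combination of two gradients (closed form; Desideri, 2012)."""
    diff = g1 - g2
    denom = diff @ diff
    alpha = 0.5 if denom < 1e-12 else float(np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0))
    return alpha * g1 + (1.0 - alpha) * g2

x, y = np.zeros(2), np.ones(2)
lr, lam = 0.05, 1.0  # step size and (assumed) penalty weight for the constraint
for _ in range(300):
    g1, g2, gc = grads(x, y)
    d = mgda_two(g1, g2) + lam * gc  # common descent direction + penalized constraint gradient
    x, y = x - lr * d[:2], y - lr * d[2:]

print(f"x={x}, y={y}, F1={F1(x, y):.3f}, F2={F2(x, y):.3f}, lower-level gap={g(x, y):.4f}")
```

Note that every gradient here is first-order: no Hessian of the lower-level objective is ever formed, which is the efficiency point the abstract makes. Swapping the fixed penalty for FORUM's constrained aggregation, and the analytic gradients for automatic differentiation over network parameters, would recover the general setting the paper targets.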
