A First-Order Multi-Gradient Algorithm for Multi-Objective Bi-Level Optimization (2401.09257v2)
Abstract: In this paper, we study the Multi-Objective Bi-Level Optimization (MOBLO) problem, where the upper-level subproblem is a multi-objective optimization problem and the lower-level subproblem is a scalar optimization problem. Existing gradient-based MOBLO algorithms need to compute the Hessian matrix, which is computationally inefficient. To address this, we propose an efficient first-order multi-gradient method for MOBLO, called FORUM. Specifically, we reformulate the MOBLO problem as a constrained multi-objective optimization (MOO) problem via the value-function approach. We then propose a novel multi-gradient aggregation method to solve this challenging constrained MOO problem. Theoretically, we provide a complexity analysis that shows the efficiency of the proposed method, together with a non-asymptotic convergence result. Empirically, extensive experiments demonstrate the effectiveness and efficiency of FORUM on different learning problems. In particular, it achieves state-of-the-art performance on three multi-task learning benchmark datasets. The code is available at https://github.com/Baijiong-Lin/FORUM.
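To make the value-function reformulation concrete, here is a minimal sketch in illustrative notation (the symbols $\alpha$, $\omega$, $F_i$, $f$, and $f^*$ are our own choices, not necessarily the paper's): the bilevel problem is rewritten as a single-level constrained MOO problem by replacing the lower-level arg-min constraint with an inequality on the lower-level value function.

```latex
% Illustrative notation: upper-level variables \alpha, lower-level
% variables \omega, m upper-level objectives F_1,\dots,F_m, and a
% scalar lower-level objective f.
\begin{align*}
  % MOBLO: minimize the vector of upper-level objectives subject to
  % \omega being a lower-level minimizer.
  \min_{\alpha}\;\; & \bigl(F_1(\alpha,\omega^*(\alpha)),\,\dots,\,F_m(\alpha,\omega^*(\alpha))\bigr) \\
  \text{s.t.}\;\; & \omega^*(\alpha) \in \arg\min_{\omega} f(\alpha,\omega). \\[6pt]
  % Value-function reformulation: with f^*(\alpha) = \min_\omega
  % f(\alpha,\omega), the same problem becomes a constrained MOO
  % problem amenable to first-order methods.
  \min_{\alpha,\,\omega}\;\; & \bigl(F_1(\alpha,\omega),\,\dots,\,F_m(\alpha,\omega)\bigr) \\
  \text{s.t.}\;\; & f(\alpha,\omega) - f^*(\alpha) \le 0.
\end{align*}
```

The "multi-gradient aggregation" step mentioned in the abstract then combines the per-objective gradients into a single update direction. FORUM's actual aggregation also accounts for the value-function constraint; purely as a generic illustration of multi-gradient aggregation, the sketch below implements the classic MGDA min-norm combination for two objectives (Désidéri, 2012). The function name and the two-task restriction are our assumptions, not the paper's update rule.

```python
import numpy as np

def mgda_two_task_direction(g1: np.ndarray, g2: np.ndarray) -> np.ndarray:
    """Min-norm convex combination of two task gradients (MGDA-style).

    Solves min_{gamma in [0, 1]} ||gamma * g1 + (1 - gamma) * g2||^2,
    whose closed form follows from projecting the origin onto the
    segment between g1 and g2. The result is a common descent
    direction for both objectives, up to first order.
    """
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:  # gradients coincide; any convex combination works
        return g1
    gamma = float(np.clip(-(g2 @ diff) / denom, 0.0, 1.0))
    return gamma * g1 + (1.0 - gamma) * g2
```

A typical use would be `d = mgda_two_task_direction(grad_F1, grad_F2)` followed by a descent step on the upper-level variables with direction `d`.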