Smooth Tchebycheff Scalarization for Multi-Objective Optimization (2402.19078v3)

Published 29 Feb 2024 in cs.LG, cs.AI, cs.NE, and math.OC

Abstract: Multi-objective optimization problems arise in many real-world applications, where the objectives often conflict with each other and cannot be optimized by a single solution. In the past few decades, numerous methods have been proposed to find Pareto solutions that represent optimal trade-offs among the objectives for a given problem. However, these existing methods may have high computational complexity or lack good theoretical properties for solving a general differentiable multi-objective optimization problem. In this work, by leveraging smooth optimization techniques, we propose a lightweight and efficient smooth Tchebycheff scalarization approach for gradient-based multi-objective optimization. It has good theoretical properties for finding all Pareto solutions with valid trade-off preferences, while enjoying significantly lower computational complexity than other methods. Experimental results on various real-world application problems demonstrate the effectiveness of the proposed method.
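The classical (weighted) Tchebycheff scalarization minimizes max_i λ_i (f_i(x) − z_i*), a nonsmooth objective; a standard way to obtain a smooth surrogate of this max is log-sum-exp smoothing with a parameter μ, i.e. g_μ(x | λ) = μ log Σ_i exp(λ_i (f_i(x) − z_i*) / μ). The snippet below is a minimal sketch under that assumption and is not the authors' reference implementation: the function name smooth_tchebycheff, the toy bi-objective problem, and the choice of μ are purely illustrative.

```python
# Minimal sketch: smooth Tchebycheff scalarization via log-sum-exp smoothing
# (assumed formulation, not the paper's official code). Requires PyTorch.
import torch

def smooth_tchebycheff(losses, weights, ideal, mu=0.1):
    """Smooth Tchebycheff scalarization.

    losses  : tensor of m objective values f_i(x)
    weights : tensor of m positive preference weights lambda_i
    ideal   : tensor of m (approximate) ideal values z_i*
    mu      : smoothing parameter; as mu -> 0 this approaches
              max_i weights_i * (losses_i - ideal_i)
    """
    return mu * torch.logsumexp(weights * (losses - ideal) / mu, dim=0)

# Toy bi-objective problem: f1(x) = ||x - a||^2, f2(x) = ||x - b||^2
a = torch.tensor([1.0, 0.0])
b = torch.tensor([0.0, 1.0])
x = torch.zeros(2, requires_grad=True)
weights = torch.tensor([0.5, 0.5])   # trade-off preference
ideal = torch.zeros(2)               # lower bounds on each objective

opt = torch.optim.Adam([x], lr=0.05)
for _ in range(200):
    losses = torch.stack([((x - a) ** 2).sum(), ((x - b) ** 2).sum()])
    scalarized = smooth_tchebycheff(losses, weights, ideal, mu=0.1)
    opt.zero_grad()
    scalarized.backward()  # one backward pass on a single scalar loss
    opt.step()

print(x.detach())  # converges to a Pareto trade-off between a and b
```

Note that each iteration needs only one backward pass through a single scalar loss, rather than a separate gradient per objective; this is the kind of low per-step overhead the abstract refers to when contrasting the approach with more expensive gradient-combination methods.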
