Online Mirror Descent for Tchebycheff Scalarization in Multi-Objective Optimization

Published 29 Oct 2024 in cs.LG and cs.AI | (2410.21764v2)

Abstract: The goal of multi-objective optimization (MOO) is to learn under multiple, potentially conflicting, objectives. One widely used technique for MOO is linear scalarization, where a fixed preference vector combines the objectives into a single scalar value for optimization. However, recent work (Hu et al., 2024) has shown that linear scalarization often fails to capture the non-convex regions of the Pareto front and therefore cannot recover the complete set of Pareto optimal solutions. In light of these limitations, this paper focuses on Tchebycheff scalarization, which optimizes for the worst-case objective. In particular, we propose an online mirror descent algorithm for Tchebycheff scalarization, which we call OMD-TCH. We show that OMD-TCH enjoys a convergence rate of $O(\sqrt{\log m/T})$, where $m$ is the number of objectives and $T$ is the number of iteration rounds. We also propose a novel adaptive online-to-batch conversion scheme that significantly improves the practical performance of OMD-TCH while maintaining the same convergence guarantees. We demonstrate the effectiveness of OMD-TCH and the adaptive conversion scheme on both synthetic problems and federated learning tasks under fairness constraints, showing state-of-the-art performance.
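
Illustrative sketch (not the authors' code): the worst-case objective $\min_x \max_i w_i f_i(x)$ can be written as the saddle-point problem $\min_x \max_{\lambda \in \Delta_m} \sum_i \lambda_i w_i f_i(x)$, because a linear function over the simplex attains its maximum at a vertex. The Python below is one plausible reading of "online mirror descent for Tchebycheff scalarization" under several assumptions that are not taken from the abstract: a zero reference point, an entropy mirror map on $\lambda$ (an exponentiated-gradient update), a plain gradient step on $x$, fixed step sizes, and uniform averaging of the iterates in place of the paper's adaptive conversion scheme. The name `omd_tch_sketch` and all its parameters are hypothetical.

```python
import numpy as np

def omd_tch_sketch(objectives, grads, x0, w, T=2000, eta_x=0.05, eta_lam=0.1):
    """Hypothetical sketch: approximately solve min_x max_i w_i * f_i(x).

    objectives: list of callables f_i(x) -> float
    grads:      list of callables returning the gradient of f_i at x
    w:          preference vector (assumed positive), one weight per objective
    """
    m = len(objectives)
    w = np.asarray(w, dtype=float)
    x = np.asarray(x0, dtype=float)
    log_lam = np.full(m, -np.log(m))   # log of uniform weights on the simplex
    x_sum = np.zeros_like(x)

    for _ in range(T):
        x_sum += x                     # accumulate for uniform averaging
        lam = np.exp(log_lam)
        losses = w * np.array([f(x) for f in objectives])
        G = np.array([g(x) for g in grads])          # shape (m, d)

        # Gradient step on x for the lambda-weighted objective.
        x_next = x - eta_x * (lam * w) @ G

        # Mirror ascent on lambda with the entropy mirror map
        # (exponentiated gradient), kept in log space for stability.
        log_lam = log_lam + eta_lam * losses
        log_lam -= log_lam.max()
        log_lam -= np.log(np.exp(log_lam).sum())

        x = x_next

    return x_sum / T, np.exp(log_lam)

# Toy problem (an assumption for illustration): two conflicting quadratics.
f1 = lambda x: float(np.sum((x - 1.0) ** 2))
f2 = lambda x: float(np.sum((x + 1.0) ** 2))
g1 = lambda x: 2.0 * (x - 1.0)
g2 = lambda x: 2.0 * (x + 1.0)
w = np.array([0.5, 0.5])

x_bar, lam = omd_tch_sketch([f1, f2], [g1, g2], x0=np.array([0.5]), w=w)
worst_case = max(w[0] * f1(x_bar), w[1] * f2(x_bar))
```

On this toy problem the two weighted objectives are equal at $x = 0$, so the averaged iterate `x_bar` should settle near zero with roughly equal weights in `lam`. The adaptive online-to-batch conversion described in the abstract would replace the uniform average `x_sum / T`; its details are not given here, so it is not reproduced.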

References (54)
  1. K. Azuma. Weighted sums of certain dependent random variables. Tohoku Mathematical Journal, Second Series, 19(3):357–367, 1967.
  2. V. J. Bowman Jr. On the relationship of the tchebycheff norm and the efficient frontier of multiple-criteria objectives. In Multiple Criteria Decision Making, 1976.
  3. S. Boyd and L. Vandenberghe. Convex optimization. Cambridge university press, 2004a.
  4. S. Boyd and L. Vandenberghe. Convex optimization. Cambridge university press, 2004b.
  5. R. Caruana. Multitask learning. Machine Learning, 28, 1997.
  6. On the generalization ability of on-line learning algorithms. IEEE Transactions on Information Theory, 50, 2004.
  7. G. Chen and M. Teboulle. Convergence analysis of a proximal-like minimization algorithm using bregman functions. SIAM Journal on Optimization, 3(3):538–543, 1993.
  8. Proper efficiency in nonconvex multicriteria programming. Mathematics of Operations Research, 8, 1983.
  9. FOCUS: Fairness via agent-awareness for federated learning on heterogeneous data. arXiv preprint arXiv:2207.10265, 2022.
  10. Exploiting shared representations for personalized federated learning. In International conference on machine learning. PMLR, 2021.
  11. I. Das and J. Dennis. A closer look at drawbacks of minimizing weighted sums of objectives for pareto set generation in multicriteria optimization problems. Structural Optimization, 14, 1997.
  12. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6, 2002.
  13. L. Deng. The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Processing Magazine, 29, 2012.
  14. J.-A. Désidéri. Multiple-gradient descent algorithm (MGDA) for multiobjective optimization. Comptes Rendus Mathematique, 350, 2012.
  15. M. Ehrgott. Multicriteria optimization. Springer Science & Business Media, 2005.
  16. J. Fliege and B. F. Svaiter. Steepest descent methods for multicriteria optimization. Mathematical Methods of Operations Research, 51, 2000.
  17. A. M. Geoffrion. Solving bicriterion mathematical programs. Operations Research, 15, 1967.
  18. An efficient framework for clustered federated learning. Advances in Neural Information Processing Systems, 33, 2020.
  19. A unifying perspective on multi-calibration: Game dynamics for multi-objective learning. In Advances in Neural Information Processing Systems, 2023.
  20. E. Hazan et al. Introduction to online convex optimization. Foundations and Trends in Optimization, 2016.
  21. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
  22. Robust multi-task learning with excess risks. In Forty-first International Conference on Machine Learning, 2024.
  23. Revisiting scalarization in multi-task learning: A theoretical perspective. Advances in Neural Information Processing Systems, 36, 2024.
  24. Federated learning meets multi-objective optimization. IEEE Transactions on Network Science and Engineering, 2022a.
  25. Federated learning meets multi-objective optimization. IEEE Transactions on Network Science and Engineering, 9, 2022b.
  26. A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
  27. Fair resource allocation in federated learning. arXiv preprint arXiv:1905.10497, 2019.
  28. Tilted empirical risk minimization. arXiv preprint arXiv:2007.01162, 2020.
  29. Pareto multi-task learning. In Advances in Neural Information Processing Systems, volume 32, 2019.
  30. Smooth tchebycheff scalarization for multi-objective optimization. In 41st International Conference on Machine Learning, 2024.
  31. Conflict-averse gradient descent for multi-task learning. Advances in Neural Information Processing Systems, 34:18878–18890, 2021.
  32. D. Mahapatra and V. Rajan. Multi-task learning with user preferences: Gradient descent with controlled ascent in pareto optimization. In Proceedings of the 37th International Conference on Machine Learning, 2020.
  33. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017a.
  34. Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics. PMLR, 2017b.
  35. Balancing average and worst-case accuracy in multitask learning. arXiv:2110.05838, 2021.
  36. K. Miettinen. Nonlinear multiobjective optimization. International Series in Operations Research & Management Science, 12, 1998.
  37. Agnostic federated learning. In Proceedings of the 36th International Conference on Machine Learning, 2019.
  38. H. Namkoong and J. C. Duchi. Stochastic gradient methods for distributionally robust optimization with f-divergences. In Advances in Neural Information Processing Systems, 2016.
  39. Multi-task learning as a bargaining game. In Proceedings of the 39th International Conference on Machine Learning, 2022.
  40. Problem complexity and method efficiency in optimization. Wiley-Interscience, 1983.
  41. Distributionally robust neural networks. In International Conference on Learning Representations, 2020.
  42. J. D. Schaffer. Multiple objective optimization with vector evaluated genetic algorithms. In Proceedings of the First International Conference on Genetic Algorithms, 1985.
  43. O. Sener and V. Koltun. Multi-task learning as multi-objective optimization. In Advances in Neural Information Processing Systems, volume 31, 2018.
  44. Independent component alignment for multi-task learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023.
  45. Multiobjective evolutionary algorithm test suites. In Proceedings of the 1999 ACM symposium on Applied computing, 1999.
  46. Federated learning with fair averaging. arXiv preprint arXiv:2104.14937, 2021.
  47. Mars: Markov molecular sampling for multi-objective drug discovery. In International Conference on Learning Representations, 2021.
  48. Gradient surgery for multi-task learning. Advances in Neural Information Processing Systems, 33:5824–5836, 2020.
  49. Proportional fairness in federated learning. arXiv preprint arXiv:2202.01666, 2022.
  50. Q. Zhang and H. Li. MOEA/D: A multiobjective evolutionary algorithm based on decomposition. IEEE Transactions on Evolutionary Computation, 11, 2007.
  51. PMGDA: A preference-based multiple gradient descent algorithm. IEEE Transactions on Emerging Topics in Computational Intelligence, 2024a.
  52. Libmoon: A gradient-based multiobjective optimization library in pytorch. Advances in Neural Information Processing Systems, 2024b.
  53. Optimal multi-distribution learning. In The Thirty Seventh Annual Conference on Learning Theory, 2024c.
  54. M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th international conference on machine learning (icml-03), 2003.
