Fair Resource Allocation in Multi-Task Learning (2402.15638v2)
Abstract: By jointly learning multiple tasks, multi-task learning (MTL) can leverage the knowledge shared across tasks, resulting in improved data efficiency and generalization performance. However, a major challenge in MTL lies in the presence of conflicting gradients, which can hinder the fair optimization of some tasks and subsequently impede MTL's ability to achieve better overall performance. Inspired by fair resource allocation in communication networks, we formulate the optimization of MTL as a utility maximization problem, in which the loss decreases across tasks are maximized under different fairness measures. To solve this problem, we propose FairGrad, a novel MTL optimization method. FairGrad not only enables flexible emphasis on certain tasks but also comes with a theoretical convergence guarantee. Extensive experiments demonstrate that our method achieves state-of-the-art performance among gradient manipulation methods on a suite of multi-task benchmarks in supervised learning and reinforcement learning. Furthermore, we incorporate the idea of $\alpha$-fairness into the loss functions of various MTL methods, and extensive empirical studies demonstrate that this significantly improves their performance. Code is provided at \url{https://github.com/OptMN-Lab/fairgrad}.
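As a rough illustration of the fairness notion the abstract invokes, the sketch below applies the standard $\alpha$-fair utility from network resource allocation to hypothetical per-task loss-decrease rates. This is not the paper's FairGrad algorithm, only the classical utility function its formulation builds on; the rate values are made up for illustration.

```python
import numpy as np

def alpha_utility(x, alpha):
    # Classical alpha-fair utility from network resource allocation:
    # u_alpha(x) = log(x) if alpha == 1, else x^(1 - alpha) / (1 - alpha).
    x = np.asarray(x, dtype=float)
    if alpha == 1.0:
        return np.log(x)
    return x ** (1.0 - alpha) / (1.0 - alpha)

# Hypothetical per-task loss-decrease rates for three tasks.
rates = np.array([0.5, 0.2, 0.1])

# alpha = 0 recovers the plain sum of rates (utilitarian objective);
# larger alpha puts more weight on the worst-off task, and in the
# limit alpha -> infinity one obtains max-min fairness.
total_alpha0 = alpha_utility(rates, 0.0).sum()  # equals rates.sum()
total_alpha2 = alpha_utility(rates, 2.0).sum()  # equals -(1/0.5 + 1/0.2 + 1/0.1)
```

Maximizing the summed utility over the allocation (here, over how much each task's loss is allowed to decrease) trades total progress against evenness across tasks, with $\alpha$ controlling the trade-off.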