STG-MTL: Scalable Task Grouping for Multi-Task Learning Using Data Map (2307.03374v2)

Published 7 Jul 2023 in cs.LG

Abstract: Multi-Task Learning (MTL) is a powerful technique that has gained popularity due to its performance improvements over traditional Single-Task Learning (STL). However, MTL is often challenging because the number of possible task groupings is exponential, which makes choosing the best one difficult, and some groupings degrade performance due to negative interference between tasks. As a result, existing solutions suffer from severe scalability issues that limit their practical application. In this paper, we propose a new data-driven method that addresses these challenges and provides a scalable, modular solution for classification task grouping based on a repurposed data-driven feature, Data Maps, which capture the training dynamics of each classification task during MTL training. Through a theoretical comparison with other techniques, we show that our approach has superior scalability. Our experiments demonstrate improved performance and verify the method's effectiveness, even on an unprecedented number of tasks (up to 100 tasks on CIFAR100). Being the first to work at this scale, we compare the resulting groupings with the class structure described for CIFAR100 and find that they largely agree. Finally, we provide a modular implementation for easier integration and testing, with examples from multiple datasets and tasks.
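
To make the idea concrete, below is a minimal, hypothetical sketch of how data-map features could drive task grouping: it assumes you log, for each task, the probability the model assigns to the gold label at every epoch, summarizes each task by its mean confidence and variability (as in dataset cartography), and then clusters the task vectors. The function names, the per-task aggregation, and the use of KMeans are illustrative assumptions, not the paper's exact pipeline.

```python
# Hypothetical sketch: per-task data maps from training dynamics, then task grouping.
import numpy as np
from sklearn.cluster import KMeans

def data_map_features(gold_probs):
    """gold_probs: array of shape (epochs, examples) with the probability the
    model assigned to the gold label at each epoch.
    Returns per-example (confidence, variability) as in dataset cartography."""
    confidence = gold_probs.mean(axis=0)   # mean gold-label probability over epochs
    variability = gold_probs.std(axis=0)   # spread of gold-label probability over epochs
    return np.stack([confidence, variability], axis=1)  # shape (examples, 2)

def group_tasks(task_gold_probs, n_groups):
    """task_gold_probs: list of (epochs, examples) arrays, one per task.
    Summarizes each task's data map into one vector and clusters tasks."""
    task_vectors = np.array(
        [data_map_features(p).mean(axis=0) for p in task_gold_probs]
    )
    labels = KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit_predict(task_vectors)
    return {task: int(group) for task, group in enumerate(labels)}

# Toy usage: 5 tasks tracked for 10 epochs over 200 examples each.
rng = np.random.default_rng(0)
toy_dynamics = [rng.uniform(0.2, 0.9, size=(10, 200)) for _ in range(5)]
print(group_tasks(toy_dynamics, n_groups=2))
```

The key point of the method is that the grouping signal comes from training dynamics collected during a single MTL run, rather than from training every candidate task combination, which is what keeps it scalable to many tasks.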
