Multi-Task Learning with Deep Neural Networks: A Survey
This paper presents a thorough survey of multi-task learning (MTL) methods based on deep neural networks. It outlines the advantages of MTL, such as improved data efficiency and reduced overfitting through shared representations, while acknowledging the complications, most notably negative transfer, that arise when dissimilar tasks are trained jointly. The paper's central objective is to categorize and evaluate existing strategies for deep multi-task learning, summarizing both foundational work and recent advances in the field.
Overview of Multi-Task Learning Approaches
The paper classifies multi-task learning methodologies into three primary categories: architectures, optimization methods, and task relationship learning.
- Architectures:
- The paper explores various architectural designs for multi-task learning, such as shared trunk models, modular policies, and conditional computation. These designs balance parameter sharing against task-specific customization to maximize learning efficiency while mitigating negative transfer (a minimal shared-trunk sketch follows this list).
- The survey also traces how shared architectures have developed within specific domains such as computer vision, natural language processing, and reinforcement learning. For instance, cross-talk architectures and modular networks illustrate contrasting strategies for partitioning and sharing parameters to boost individual task performance.
- Optimization Techniques:
- The discussion expands to optimization techniques that adjust training dynamics, such as loss weighting strategies, regularization, and gradient modulation. Methods like adversarial training and Pareto optimization emerge as notable strategies to resolve task conflicts during joint learning.
- Several approaches balance task-specific losses automatically based on signals such as prediction uncertainty or per-task learning speed, aiming to keep the multi-task model robust across diverse tasks (a sketch of uncertainty-based weighting follows this list).
- Task Relationship Learning:
- The survey also considers task relationship learning methods, which seek to discover and use the relationships between tasks by learning task embeddings or by clustering similar tasks for joint training. The goal is to exploit inter-task dependencies so that related tasks reinforce one another during training (a simple task-clustering sketch follows this list).
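To ground the shared-trunk design mentioned above, here is a minimal PyTorch sketch of hard parameter sharing; the layer sizes, two-task setup, and class name are illustrative assumptions rather than details from the survey.

```python
import torch
import torch.nn as nn

class SharedTrunkMTL(nn.Module):
    """Hard parameter sharing: one shared trunk feeds several task-specific heads."""

    def __init__(self, in_dim=64, hidden_dim=128, task_out_dims=(10, 1)):
        super().__init__()
        # Shared trunk: its parameters receive gradients from every task.
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
        )
        # Task-specific heads: one small output module per task.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_dim, out_dim) for out_dim in task_out_dims]
        )

    def forward(self, x):
        shared = self.trunk(x)
        # One prediction per task, all computed from the shared representation.
        return [head(shared) for head in self.heads]

# Example usage: a 10-class classification head and a scalar regression head.
model = SharedTrunkMTL()
outputs = model(torch.randn(8, 64))
```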
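The automatic loss-balancing idea can be illustrated with a sketch of homoscedastic-uncertainty weighting, one family of uncertainty-based schemes the survey covers; the simplified loss form and variable names below are assumptions made for the example.

```python
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    """Weights per-task losses by learned (log) task uncertainty.

    Each task i keeps a learnable log-variance s_i; the combined loss is
    sum_i( exp(-s_i) * L_i + s_i ), so harder or noisier tasks are
    down-weighted automatically while the +s_i term keeps the weights
    from collapsing to zero.
    """

    def __init__(self, num_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]
        return total

# Example usage: combine two task losses; log_vars are optimized jointly with the model.
weighting = UncertaintyWeighting(num_tasks=2)
combined = weighting([torch.tensor(0.7), torch.tensor(1.3)])
```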
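As a rough illustration of task grouping, the sketch below greedily clusters tasks by the cosine similarity of learned task embeddings; the thresholding scheme and function name are hypothetical and not a specific method from the survey.

```python
import torch

def cluster_tasks_by_similarity(task_embeddings, threshold=0.5):
    """Greedily group tasks whose embedding cosine similarity exceeds a threshold.

    task_embeddings: (num_tasks, dim) tensor of learned task representations
    (e.g., task-conditioned vectors or averaged gradient directions).
    Returns a list of clusters, each a list of task indices to train jointly.
    """
    normed = torch.nn.functional.normalize(task_embeddings, dim=1)
    similarity = normed @ normed.T  # pairwise cosine similarities
    clusters, assigned = [], set()
    for i in range(similarity.size(0)):
        if i in assigned:
            continue
        # Start a new cluster with task i and pull in all sufficiently similar tasks.
        members = [j for j in range(similarity.size(0))
                   if j not in assigned and similarity[i, j] >= threshold]
        assigned.update(members)
        clusters.append(members)
    return clusters

# Example usage with four hypothetical task embeddings.
clusters = cluster_tasks_by_similarity(torch.randn(4, 16))
```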
Benchmarks and Empirical Evaluations
The paper includes an overview of the datasets and benchmarks commonly used to evaluate multi-task learning models in different domains: NYU-v2 for computer vision, OntoNotes for NLP, and Meta-World for reinforcement learning, among others. These benchmarks provide common testbeds for empirically comparing and validating MTL models across multiple, potentially interacting tasks; a sketch of a per-task evaluation loop appears below.
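A per-task evaluation loop like the one sketched below is one common way such comparisons are organized; the task-conditioned `model(inputs, task=...)` interface, the loader and metric dictionaries, and the function name are hypothetical assumptions rather than details from the survey or any specific benchmark.

```python
import torch

def evaluate_multitask(model, task_loaders, metric_fns):
    """Evaluate one model on several tasks and report a per-task score.

    task_loaders: dict mapping task name -> DataLoader yielding (inputs, targets)
    metric_fns:   dict mapping task name -> callable(preds, targets) -> float
    """
    model.eval()
    scores = {}
    with torch.no_grad():
        for task, loader in task_loaders.items():
            total, count = 0.0, 0
            for inputs, targets in loader:
                preds = model(inputs, task=task)  # hypothetical task-conditioned forward pass
                total += metric_fns[task](preds, targets) * len(targets)
                count += len(targets)
            scores[task] = total / max(count, 1)
    return scores
```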
Implications and Future Directions
The surveyed methods point toward more robust learning systems that can effectively transfer knowledge across tasks, much as human cognition does. The survey lays groundwork for future research, arguing that advances in architecture design, optimization strategies, and deeper theoretical foundations are crucial for building practical, adaptive MTL systems.
Moreover, the paper calls attention to the relative neglect of theoretical underpinnings in MTL. Addressing this gap could provide a more comprehensive understanding of the dynamics and success conditions of multi-task learning, potentially leading to new methodologies with better generalization and adaptability.
In conclusion, the survey serves as a pivotal resource that organizes existing knowledge and identifies technical gaps within multi-task learning, aiming to guide researchers towards developing more efficient, scalable, and human-like learning systems.