Curriculum Learning of Multiple Tasks
The paper "Curriculum Learning of Multiple Tasks" by Anastasia Pentina, Viktoriia Sharmanska, and Christoph H. Lampert addresses a critical challenge in multi-task learning: optimizing the sequence in which tasks are learned to improve overall classification performance. The authors introduce a novel approach wherein tasks are processed sequentially, rather than jointly or independently, enabling better information transfer between related tasks. The paper specifically investigates the influence of task order on learning efficiency and offers a theoretical foundation for optimizing this order.
Central to the paper is a sequential learning framework that resembles curriculum learning, akin to human education, where concepts build upon one another. The authors leverage PAC-Bayesian theory to derive a generalization bound, which serves as the criterion for choosing a beneficial task order. The framework assumes linear predictors and measures task similarity by the Euclidean distance between weight vectors; transfer is realized through Adaptive Support Vector Machines (Adaptive SVMs), in which each task's classifier is regularized toward the solution of the previously solved task rather than toward zero.
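To make the transfer mechanism concrete, below is a minimal sketch of an Adaptive SVM trained by subgradient descent: the regularizer pulls the new task's weights toward the previous task's solution w_src instead of toward the origin. The function name, optimizer, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def train_adaptive_svm(X, y, w_src, C=1.0, epochs=200, lr=0.01):
    """Minimize 0.5 * ||w - w_src||^2 + C * sum_i max(0, 1 - y_i * <w, x_i>).

    X: (n, d) feature matrix; y: (n,) labels in {-1, +1};
    w_src: (d,) weight vector transferred from the previously solved task.
    """
    w = w_src.copy()
    for epoch in range(epochs):
        margins = y * (X @ w)                      # signed margins, shape (n,)
        active = margins < 1.0                     # examples with nonzero hinge loss
        # Subgradient of the objective: pull toward w_src plus the hinge-loss term.
        grad = (w - w_src) - C * (y[active, None] * X[active]).sum(axis=0)
        w -= lr / (1.0 + epoch) * grad             # decaying step size
    return w
```

Setting w_src to the zero vector recovers a standard SVM, which is how the first task in a sequence is trained before any transfer is available.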
The empirical evaluation uses two datasets, Animals with Attributes (AwA) and Shoes, on which the proposed sequential learning algorithm, SeqMT, is benchmarked against traditional multi-task (MT) and single-task learning approaches. The results demonstrate that SeqMT not only outperforms these baselines when tasks are related but also discovers an advantageous sequence that beats heuristic and random task orderings. Notably, in scenarios where task relationships vary, the MultiSeqMT extension allows the discovery of subsequences of related tasks, further underscoring the flexibility and effectiveness of the approach.
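The order itself can be chosen greedily: at every step, adapt from the current weights to each remaining task and commit to the task whose adapted classifier scores best under the bound. The sketch below reuses train_adaptive_svm from the previous sketch; scoring by training error is a stand-in for the paper's PAC-Bayesian criterion, and the tasks-as-(X, y)-pairs layout is an assumed convention.

```python
import numpy as np

def greedy_task_order(tasks, d, C=1.0):
    """Greedily order a list of binary tasks, each given as an (X, y) pair.

    Reuses train_adaptive_svm from the sketch above; d is the feature dimension.
    """
    w_prev = np.zeros(d)                           # the first task adapts from zero
    remaining = list(range(len(tasks)))
    order = []
    while remaining:
        best_t, best_err, best_w = None, np.inf, None
        for t in remaining:
            X, y = tasks[t]
            w = train_adaptive_svm(X, y, w_prev, C=C)
            err = np.mean(y * (X @ w) <= 0)        # training error as bound proxy
            if err < best_err:
                best_t, best_err, best_w = t, err, w
        order.append(best_t)
        remaining.remove(best_t)
        w_prev = best_w                            # the next task transfers from here
    return order
```

The MultiSeqMT extension can be viewed as relaxing this loop so that a task may instead start a new subsequence from scratch when no existing chain of related tasks helps it.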
The experimental results provide compelling evidence that optimizing the task sequence can substantially enhance multi-task learning performance, challenging the uniform treatment of tasks prevalent in joint learning approaches. For example, SeqMT achieved lower average error rates than MT and independently trained SVMs across multiple tasks, and it consistently improved over random task orderings, indicating that even when tasks are not closely related, the order in which they are learned significantly affects the outcome.
The theoretical implications of this research extend beyond sequential task learning, connecting to domain adaptation and information transfer in learning algorithms more broadly. It suggests a more nuanced view of task relationships, in which not all tasks contribute equally when solved jointly in conventional multi-task frameworks. Practically, the methodology offers a way to reduce data requirements by optimizing the learning sequence, which could be particularly beneficial in domains where data collection and labeling are resource-intensive.
One identified limitation is that tasks must be processed as a single linear chain, which presumes a linear structure of task relationships. Future research could explore richer transfer structures such as trees or graphs, improving robustness to outlier tasks and accommodating non-linear relationships among tasks. This research opens avenues for dynamic learning architectures that adaptively tailor learning pathways based on observed data and task characteristics.
A refined understanding of task interdependence through learned curricula can drive advances in AI, particularly in personalized learning systems, adaptive robotics, and scalable automation, where task priorities are deeply context-dependent. The handling of subsequences of tasks also hints at scalability improvements that matter for applying multi-task learning to broader and more diverse real-world applications.