Clustered Multi-Task Learning: A Convex Formulation
The paper introduces a novel approach to multi-task learning (MTL) that clusters tasks into groups, under the assumption that tasks within the same group share similar weight vectors. The approach dispenses with prior knowledge of the task groupings: a convex formulation jointly optimizes the identification of clusters and the learning of the task weights. The methodology is developed for learning linear functions in supervised tasks.
Regularization and Multi-Task Learning
Regularization is foundational in machine learning, especially for managing high-dimensional data through norms such as the Euclidean norm and, more generally, Hilbert-space norms. The ℓ1-norm, which promotes sparsity, is a notable example. This paper positions itself within that tradition, innovating by designing a spectral norm that encodes the cluster hypothesis and can be optimized without explicit knowledge of the groups.
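As a reference point, the generic regularized objective this line of work builds on can be sketched as follows; this is standard textbook notation, not notation taken from the paper:

```latex
\min_{w \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} \ell\!\left(y_i, w^\top x_i\right) + \lambda \, \Omega(w),
\qquad
\Omega(w) = \|w\|_2^2 \ \text{(ridge)}
\quad \text{or} \quad
\Omega(w) = \|w\|_1 \ \text{(sparsity)}.
```

The paper's contribution amounts to replacing \(\Omega\) with a matrix penalty on the stacked task weights that couples the tasks through their cluster structure.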
Convex Optimization in Multi-Task Learning
The paper's core contribution is a convex optimization framework that identifies hidden clusters among tasks, thereby improving MTL. The method treats cluster assignments as latent variables, allowing cluster identification and weight learning to proceed simultaneously. This differs from previous work, which either assumes the clusters are known or enforces a global similarity across all tasks.
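Schematically, and with notation assumed rather than quoted from the paper, the joint problem minimizes over both the task-weight matrix W (one column per task) and a cluster matrix M:

```latex
\min_{W,\; M \in \mathcal{M}} \;\; \sum_{t=1}^{m} \widehat{R}_t(w_t) + \lambda \, \Omega(W, M),
```

where the discrete set of cluster-assignment projection matrices is relaxed to a convex set, for instance of the form \(\mathcal{M} = \{M : 0 \preceq M \preceq I,\ \operatorname{tr} M = r\}\) for \(r\) clusters. The paper's exact constraint set may differ in detail, but the relaxation is what renders the joint problem convex.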
Methodology and Mathematical Formulation
The authors formulate the problem using a penalty function that combines a task-clustering prior with regularization. The clustering hypothesis is embedded via three variance measures on the task weight vectors: the global variance of the mean, the between-cluster variance, and the within-cluster variance. These measures define an optimization problem whose aim is to minimize the empirical risk augmented by the clustering penalties. Because the discrete choice of cluster assignments makes this problem non-convex, the authors relax the set of cluster-assignment matrices to a convex set, yielding a computationally tractable convex approximation of discrete clustering.
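To make the three variance measures concrete, here is a minimal NumPy sketch that computes them directly from a task-weight matrix and a hard cluster assignment. The variable names and the trade-off weights eps_* are hypothetical, chosen for illustration rather than taken from the paper:

```python
import numpy as np

def clustered_penalty(W, E, eps_mean=1.0, eps_between=1.0, eps_within=1.0):
    """Sum of the three clustering penalties on a d x m weight matrix W,
    given an m x r binary task-to-cluster assignment matrix E."""
    d, m = W.shape
    w_bar = W.mean(axis=1, keepdims=True)        # global mean weight vector
    cluster_sizes = E.sum(axis=0)                # number of tasks per cluster
    cluster_means = (W @ E) / cluster_sizes      # d x r cluster centroids

    # Global variance: size of the overall mean weight vector.
    omega_mean = m * np.sum(w_bar ** 2)

    # Between-cluster variance: spread of centroids around the global mean.
    omega_between = np.sum(
        cluster_sizes * np.sum((cluster_means - w_bar) ** 2, axis=0)
    )

    # Within-cluster variance: spread of each task around its centroid.
    omega_within = np.sum((W - cluster_means @ E.T) ** 2)

    return (eps_mean * omega_mean
            + eps_between * omega_between
            + eps_within * omega_within)

# Toy usage: 5 tasks in R^3, clusters {0, 1, 2} and {3, 4}.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
E = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]])
print(clustered_penalty(W, E))
```

In the convex formulation the hard assignment E disappears: the penalty is expressed through a matrix that is then allowed to range over the relaxed convex set described above.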
Simulations and Empirical Results
The algorithm is tested on synthetic datasets and on the IEDB MHC-I peptide-binding dataset. Results indicate superior performance compared with established convex and non-convex MTL methods. Notably, the method provides structured learning without preliminary knowledge of task affinity, and identifies task clusters as a by-product. Tests on synthetic data demonstrate the advantage of this formulation over trace-norm and Frobenius-norm penalizations, particularly when training data are limited.
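For context on the baselines mentioned above, the two competing matrix penalties are easy to state; a short sketch (NumPy refers to the trace norm as the nuclear norm):

```python
import numpy as np

# Baseline matrix penalties the clustered penalty is compared against.
W = np.random.default_rng(1).normal(size=(3, 5))   # d x m task-weight matrix

trace_norm = np.linalg.norm(W, ord='nuc')          # sum of singular values: favors low rank
frobenius_sq = np.linalg.norm(W, ord='fro') ** 2   # uniform shrinkage of all weights
print(trace_norm, frobenius_sq)
```

Neither baseline encodes a cluster structure: the trace norm favors weight matrices of low rank, and the squared Frobenius norm shrinks all weights uniformly.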
Implications and Future Directions
This research extends MTL by incorporating latent clustering, which is critical in settings where tasks naturally partition into unknown groups. Practically, this benefits scenarios with variable task relationships and outliers, offering a more tailored approach to task regularization. The paper suggests potential for further studies in non-linear multi-task adaptations and integration of task-specific features, which could expand the utility of this approach in complex real-world applications.
The proposed methodology represents an evolution of the MTL landscape, handling task variability in a more nuanced way without imposing prior clustering assumptions. Future work could refine the convex relaxation, potentially improving precision and computational efficiency on larger and more diverse datasets.