Distribution Matching for Heterogeneous Multi-Task Learning: a Large-scale Face Study (2105.03790v1)

Published 8 May 2021 in cs.CV

Abstract: Multi-Task Learning has emerged as a methodology in which multiple tasks are jointly learned by a shared learning algorithm, such as a DNN. MTL is based on the assumption that the tasks under consideration are related; therefore it exploits shared knowledge for improving performance on each individual task. Tasks are generally considered to be homogeneous, i.e., to refer to the same type of problem. Moreover, MTL is usually based on ground truth annotations with full, or partial overlap across tasks. In this work, we deal with heterogeneous MTL, simultaneously addressing detection, classification & regression problems. We explore task-relatedness as a means for co-training, in a weakly-supervised way, tasks that contain little, or even non-overlapping annotations. Task-relatedness is introduced in MTL, either explicitly through prior expert knowledge, or through data-driven studies. We propose a novel distribution matching approach, in which knowledge exchange is enabled between tasks, via matching of their predictions' distributions. Based on this approach, we build FaceBehaviorNet, the first framework for large-scale face analysis, by jointly learning all facial behavior tasks. We develop case studies for: i) continuous affect estimation, action unit detection, basic emotion recognition; ii) attribute detection, face identification. We illustrate that co-training via task relatedness alleviates negative transfer. Since FaceBehaviorNet learns features that encapsulate all aspects of facial behavior, we conduct zero-/few-shot learning to perform tasks beyond the ones that it has been trained for, such as compound emotion recognition. By conducting a very large experimental study, utilizing 10 databases, we illustrate that our approach outperforms, by large margins, the state-of-the-art in all tasks and in all databases, even in these which have not been used in its training.

Authors (3)

Dimitrios Kollias
Viktoriia Sharmanska
Stefanos Zafeiriou

Summary

The paper "Distribution Matching for Heterogeneous Multi-Task Learning: a Large-scale Face Study" by Dimitrios Kollias, Viktoriia Sharmanska, and Stefanos Zafeiriou addresses heterogeneous multi-task learning (MTL) for facial behavior analysis. Most existing MTL frameworks assume homogeneous tasks, i.e., tasks of the same problem type. This paper instead jointly trains detection, classification, and regression tasks, which requires a mechanism for sharing knowledge across tasks whose outputs and annotations differ.

Overview

This research is conducted on a large scale, primarily focusing on facial behavior recognition tasks, such as affect estimation, facial action unit (AU) detection, basic emotion recognition, attribute detection, and identity recognition. The multiplicity of tasks poses unique challenges related to knowledge transfer, task-relatedness, and avoidance of negative transfer. To address these, the authors introduce a novel distribution matching approach which exploits task-relatedness to facilitate weak supervision when task annotations are limited or non-overlapping.

Methodology

Task-Relatedness: The paper proposes leveraging both domain knowledge and empirical task relationships. Specifically, it incorporates findings from psychological studies about facial expressions and AUs to guide the design of task relationships.
Distribution Matching: By employing knowledge distillation principles, the approach aligns prediction distributions across different tasks, thus fostering mutual learning. This is realized through a distribution matching loss that ensures consistency in predictions across tasks like emotions and AUs.
FaceBehaviorNet: The authors introduce FaceBehaviorNet, a multi-task framework trained jointly on several datasets, including Aff-Wild2, AffectNet, and RAF-DB, that together cover a broad range of facial expressions and behavior. The network draws on the combined annotations and on distribution matching to learn features that transfer across tasks.
Weak Supervision and Co-Training: The framework capitalizes on weakly supervised data, extending training with data that lack full task annotations. Co-training is enabled via implicit task coupling through distribution matching and co-annotation-based methods.
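
The distribution matching idea described above can be illustrated with a minimal sketch. It assumes that AU probabilities implied by the emotion head are modeled as a mixture over emotion categories, weighted by a prior table P(AU | emotion), and that the AU head is pulled toward that mixture with a binary cross-entropy term. The function names, the table values, and the choice of cross-entropy are illustrative assumptions; the paper's exact loss may differ.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def distribution_matching_loss(emotion_logits, au_logits, au_given_emotion, eps=1e-7):
    """Cross-entropy between AU predictions and the AU distribution
    implied by the emotion predictions (hypothetical formulation).

    emotion_logits:   (batch, n_emotions) raw scores of the emotion head
    au_logits:        (batch, n_aus) raw scores of the AU head
    au_given_emotion: (n_emotions, n_aus) assumed prior table P(AU | emotion)
    """
    p_emotion = softmax(emotion_logits)            # (batch, n_emotions)
    p_au = sigmoid(au_logits)                      # (batch, n_aus)
    # Mixture over emotions: q(AU) = sum_e p(e) * P(AU | e)
    q_au = p_emotion @ au_given_emotion            # (batch, n_aus)
    q_au = np.clip(q_au, eps, 1.0 - eps)
    p_au = np.clip(p_au, eps, 1.0 - eps)
    # Binary cross-entropy with q_au as the soft target for p_au
    bce = -(q_au * np.log(p_au) + (1.0 - q_au) * np.log(1.0 - p_au))
    return bce.mean()
```

Because the loss needs only the two heads' predictions, it can in principle be computed on images that carry no annotation for either task, which is what makes it usable for the weakly supervised co-training described above.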

Empirical Validation

The model is evaluated on ten databases and compared against state-of-the-art methods and single-task baselines:

Performance Gains: FaceBehaviorNet is evaluated with metrics such as F1 score and Concordance Correlation Coefficient (CCC). The authors report that it outperforms the state of the art by large margins in all tasks and on all databases, including databases not used in its training.
Negative Transfer Mitigation: Co-training via task-relatedness and distribution matching alleviates negative transfer, a common multi-task learning pitfall in which joint training degrades performance on some tasks.
Generalization: Because the learned features cover many aspects of facial behavior, the network can be applied to tasks it was not trained for. The paper demonstrates this with zero-shot and few-shot learning for compound emotion recognition.
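
For readers unfamiliar with the Concordance Correlation Coefficient mentioned above, the following is a minimal sketch of its standard definition, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). The function name is illustrative and this is not the authors' evaluation code.

```python
import numpy as np


def concordance_correlation_coefficient(y_true, y_pred):
    """Standard CCC between two 1-D sequences, using population statistics."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_true, mean_pred = y_true.mean(), y_pred.mean()
    covariance = ((y_true - mean_true) * (y_pred - mean_pred)).mean()
    denominator = y_true.var() + y_pred.var() + (mean_true - mean_pred) ** 2
    return 2.0 * covariance / denominator
```

Unlike Pearson correlation, CCC also penalizes differences in mean and scale, so predictions that follow the right trend but are offset from the annotations score below 1.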

Implications and Future Directions

This work shows that distribution matching can couple heterogeneous tasks in a single multi-task network, even when their annotations overlap little or not at all. Practically, it is relevant to human-computer interaction systems, facial recognition applications, and emotion AI. Theoretically, it opens avenues for refining task-sharing strategies in multi-task networks.

Future directions may include integrating temporal learning mechanisms, given the dynamic nature of facial behavior; modeling task-relatedness with more sophisticated methods; and extending the methodology to other domains that require joint learning across diverse tasks.

Overall, the paper shows how MTL techniques can be adapted to a complex, real-world setting that spans several distinct facial behavior tasks.
