
Fully-adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification (1611.05377v1)

Published 16 Nov 2016 in cs.CV and cs.LG

Abstract: Multi-task learning aims to improve generalization performance of multiple prediction tasks by appropriately sharing relevant information across them. In the context of deep neural networks, this idea is often realized by hand-designed network architectures with layers that are shared across tasks and branches that encode task-specific features. However, the space of possible multi-task deep architectures is combinatorially large and often the final architecture is arrived at by manual exploration of this space subject to designer's bias, which can be both error-prone and tedious. In this work, we propose a principled approach for designing compact multi-task deep learning architectures. Our approach starts with a thin network and dynamically widens it in a greedy manner during training using a novel criterion that promotes grouping of similar tasks together. Our extensive evaluation on person attributes classification tasks involving facial and clothing attributes suggests that the models produced by the proposed method are fast, compact and can closely match or exceed the state-of-the-art accuracy from strong baselines by much more expensive models.

Citations (376)

Summary

  • The paper presents a fully-adaptive feature sharing framework that dynamically adjusts network architecture based on task similarity to improve performance.
  • It employs a top-down widening process with SOMP-based initialization from pre-trained models to mitigate negative transfer risks.
  • Empirical evaluations on CelebA and DeepFashion show models up to 90x smaller and 3x faster in inference while maintaining competitive accuracy.

Overview of Fully-adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification

The paper "Fully-adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification" presents a significant step towards automating the design of efficient multi-task learning architectures in deep learning, specifically for computer vision tasks concerning person attribute classification. The authors propose a methodology that streamlines the architectural design process by dynamically adjusting network structures based on task similarity, rather than relying on manual, potentially biased exploration.

Key Contributions

The authors introduce a framework for dynamically widening a neural network architecture, beginning with a thin base model and expanding it during training as dictated by a task similarity criterion. This adaptive approach stands in contrast to traditional, manually designed architectures and aims to mitigate the risk of negative transfer, a phenomenon where sharing features between unrelated tasks hurts performance. Crucially, the proposed methodology addresses the challenge of deciding how much feature sharing is appropriate across tasks while keeping the complexity of the model in check.
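To make the idea of a task similarity criterion concrete, here is a minimal sketch of one way such a measure could be computed. It assumes that two tasks count as related when the same examples tend to be easy or difficult for both of them. The function names `task_affinity` and `group_tasks`, the mean-error threshold, and the simple greedy grouping are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def task_affinity(errors: np.ndarray) -> np.ndarray:
    """Pairwise task affinity from per-example errors.

    errors has shape (n_examples, n_tasks). An example is marked
    "difficult" for a task when its error exceeds that task's mean error
    (an assumed threshold). The affinity of two tasks is the fraction of
    examples that are difficult for both or easy for both.
    """
    difficult = (errors > errors.mean(axis=0, keepdims=True)).astype(float)
    easy = 1.0 - difficult
    n_examples = errors.shape[0]
    both_difficult = difficult.T @ difficult / n_examples
    both_easy = easy.T @ easy / n_examples
    return both_difficult + both_easy


def group_tasks(affinity: np.ndarray, threshold: float) -> list[list[int]]:
    """Greedy grouping (a simplification): a task joins the first group
    in which every member has affinity >= threshold with it; otherwise
    it starts a new group, i.e. a new branch."""
    groups: list[list[int]] = []
    for task in range(affinity.shape[0]):
        for group in groups:
            if all(affinity[task, member] >= threshold for member in group):
                group.append(task)
                break
        else:
            groups.append([task])
    return groups


# Toy data: tasks 0 and 1 struggle on the same examples, task 2 on the others.
toy_errors = np.array([
    [0.9, 0.8, 0.1],
    [0.1, 0.2, 0.9],
    [0.8, 0.9, 0.2],
    [0.2, 0.1, 0.8],
])
toy_affinity = task_affinity(toy_errors)
toy_groups = group_tasks(toy_affinity, threshold=0.5)
```

In this toy case tasks 0 and 1 end up sharing a branch while task 2 is split off, which is the behaviour a similarity-driven widening rule is meant to produce.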

The framework operates in a top-down manner: it begins with a compact model, initialized using a Simultaneous Orthogonal Matching Pursuit (SOMP) method to efficiently transfer knowledge from a pre-trained model, such as VGG-16, to a thinner network while minimizing initialization error. The network is then progressively widened by introducing new branches where a quantitative measure of task similarity calls for them. A greedy algorithm decides when and where to branch: it computes task affinities and places tasks that show low relatedness on separate branches. The branching decision is made layer by layer, with the aim of keeping the memory footprint and inference latency low.
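The SOMP-based initialization can be illustrated with a small sketch. It assumes the goal is to keep `k` rows of a pre-trained weight matrix `W` such that every row of `W` is well approximated by a linear combination of the kept rows. The function name `somp_select_rows` and the toy matrix are hypothetical, and this is a generic SOMP loop rather than the paper's exact procedure.

```python
import numpy as np


def somp_select_rows(W: np.ndarray, k: int) -> tuple[list[int], np.ndarray]:
    """Greedily pick k rows of W so that W is approximately A @ W[selected].

    Each step picks the row most correlated with the current residual,
    summed over all rows being reconstructed, then refits A by least
    squares on the rows selected so far.
    """
    n_rows = W.shape[0]
    norms = np.linalg.norm(W, axis=1)
    norms[norms == 0] = 1.0                 # avoid division by zero
    atoms = W / norms[:, None]              # unit-norm candidate rows
    residual = W.copy()
    selected: list[int] = []
    A = np.zeros((n_rows, 0))
    for _ in range(min(k, n_rows)):
        # scores[j]: total absolute correlation of atom j with all residual rows
        scores = np.abs(residual @ atoms.T).sum(axis=0)
        if selected:
            scores[selected] = -np.inf      # never pick the same row twice
        selected.append(int(np.argmax(scores)))
        basis = W[selected]                 # shape (n_selected, n_cols)
        # Solve basis.T @ X = W.T in the least-squares sense; A is X transposed.
        coeffs, *_ = np.linalg.lstsq(basis.T, W.T, rcond=None)
        A = coeffs.T
        residual = W - A @ basis
    return selected, A


# Toy matrix: rows 2 and 3 are multiples of rows 0 and 1.
W_toy = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 3.0, 0.0],
])
kept_rows, A_toy = somp_select_rows(W_toy, k=2)
reconstruction_error = np.linalg.norm(W_toy - A_toy @ W_toy[kept_rows])
```

Under these assumptions, the kept rows would initialize the thin layer, and the reconstruction error indicates how much of the pre-trained layer's information is lost by the thinning.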

Numerical Results and Analysis

The authors conduct extensive evaluations on facial and clothing attribute datasets, namely CelebA and DeepFashion, respectively. The results demonstrate that the automatically configured architecture achieves accuracy comparable to state-of-the-art methods. The proposed architectures are also substantially smaller and faster: up to 90x more compact than existing models, with prediction times up to 3x faster than more cumbersome models like the full VGG-16. These gains come without a significant loss in accuracy, underscoring the practicality and effectiveness of the proposed architecture design algorithm.

Implications and Future Prospects

The implications of this work intersect both theoretical aspects of learning efficiency and practical concerns about scalability in deep learning systems. The capability to craft efficient and compact architectures dynamically paves the way for more sustainable development of neural networks, particularly in resource-constrained environments. Moreover, the methodology could be extended to fields like incremental learning, where models need to adapt to new tasks over time, and domain adaptation, which seeks to transfer knowledge across varying data distributions.

Speculatively, this paper sets a precedent for more automated approaches to architectural design, potentially influencing how models are developed for applications well beyond attribute classification. Future work could explore closer integration with neural architecture search techniques and examine how such automatically designed networks map onto AI hardware accelerators and cloud-based AI platforms.

Overall, the proposed fully-adaptive feature sharing framework presents a rigorous and systematic method for optimizing the efficiency and efficacy of multi-task networks in the context of person attribute classification. The work illustrates the potential for reducing human bias and resources in architectural design through an empirical, data-driven approach, promising enhanced model performance across varied computer vision tasks.