
X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation (2203.08764v1)

Published 16 Mar 2022 in cs.CV and cs.AI

Abstract: In computer vision, pre-training models based on large-scale supervised learning have proven effective over the past few years. However, existing works mostly focus on learning from an individual task with a single data source (e.g., ImageNet for classification or COCO for detection). This restricted form limits their generalizability and usability because it lacks the vast semantic information available from various tasks and data sources. Here, we demonstrate that jointly learning from heterogeneous tasks and multiple data sources contributes to a universal visual representation, leading to better transfer results on various downstream tasks. Learning how to bridge the gaps among different tasks and data sources is therefore the key, but it remains an open question. In this work, we propose a representation learning framework called X-Learner, which learns the universal feature of multiple vision tasks supervised by various sources, through an expansion stage and a squeeze stage: 1) Expansion Stage: X-Learner learns task-specific features to alleviate task interference and enriches the representation through a reconciliation layer. 2) Squeeze Stage: X-Learner condenses the model to a reasonable size and learns a universal and generalizable representation for transfer to various tasks. Extensive experiments demonstrate that X-Learner achieves strong performance on different tasks without extra annotations, modalities, or computational costs compared to existing representation learning methods. Notably, a single X-Learner model shows remarkable gains of 3.0%, 3.3%, and 1.8% over current pretrained models on 12 downstream datasets for classification, object detection, and semantic segmentation.

Authors (10)
  1. Yinan He (34 papers)
  2. Gengshi Huang (3 papers)
  3. Siyu Chen (105 papers)
  4. Jianing Teng (4 papers)
  5. Wang Kun (3 papers)
  6. Zhenfei Yin (41 papers)
  7. Lu Sheng (63 papers)
  8. Ziwei Liu (368 papers)
  9. Yu Qiao (563 papers)
  10. Jing Shao (109 papers)
Citations (5)
