Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks (2308.06739v1)

Published 13 Aug 2023 in cs.CV

Abstract: Despite the rapid advancement of unsupervised learning in visual representation, it requires training on large-scale datasets that demand costly data collection and pose additional challenges due to concerns regarding data privacy. Recently, synthetic images generated by text-to-image diffusion models have shown great potential for benefiting image recognition. Although promising, there has been inadequate exploration dedicated to unsupervised learning on diffusion-generated images. To address this, we start by uncovering that diffusion models' cross-attention layers inherently provide annotation-free attention masks aligned with corresponding text inputs on generated images. We then investigate the problems of three prevalent unsupervised learning techniques (i.e., contrastive learning, masked modeling, and vision-language pretraining) and introduce customized solutions by fully exploiting the aforementioned free attention masks. Our approach is validated through extensive experiments that show consistent improvements in baseline models across various downstream tasks, including image classification, detection, segmentation, and image-text retrieval. By utilizing our method, it is possible to close the performance gap between unsupervised pretraining on synthetic data and real-world scenarios.

Authors (7)
  1. David Junhao Zhang (19 papers)
  2. Mutian Xu (12 papers)
  3. Chuhui Xue (19 papers)
  4. Wenqing Zhang (60 papers)
  5. Xiaoguang Han (118 papers)
  6. Song Bai (87 papers)
  7. Mike Zheng Shou (165 papers)
Citations (6)