
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea (2411.15738v2)

Published 24 Nov 2024 in cs.CV

Abstract: Instruction-based image editing aims to modify specific image elements with natural language instructions. However, current models in this domain often struggle to accurately execute complex user instructions, as they are trained on low-quality data with limited editing types. We present AnyEdit, a comprehensive multi-modal instruction editing dataset, comprising 2.5 million high-quality editing pairs spanning over 20 editing types and five domains. We ensure the diversity and quality of the AnyEdit collection through three aspects: initial data diversity, adaptive editing process, and automated selection of editing results. Using the dataset, we further train a novel AnyEdit Stable Diffusion with task-aware routing and learnable task embedding for unified image editing. Comprehensive experiments on three benchmark datasets show that AnyEdit consistently boosts the performance of diffusion-based editing models. This presents prospects for developing instruction-driven image editing models that support human creativity.

Authors (10)
  1. Qifan Yu (14 papers)
  2. Wei Chow (11 papers)
  3. Zhongqi Yue (17 papers)
  4. Kaihang Pan (17 papers)
  5. Yang Wu (175 papers)
  6. Xiaoyang Wan (1 paper)
  7. Juncheng Li (121 papers)
  8. Siliang Tang (116 papers)
  9. Hanwang Zhang (161 papers)
  10. Yueting Zhuang (164 papers)
Citations (1)
