CustAny: Customizing Anything from A Single Example (2406.11643v4)

Published 17 Jun 2024 in cs.CV

Abstract: Recent advances in diffusion-based text-to-image models have simplified creating high-fidelity images, but preserving the identity (ID) of specific elements, such as a person's own dog, is still challenging. Object customization, using reference images and textual descriptions, is key to addressing this issue. Current object customization methods are either object-specific, requiring extensive fine-tuning, or object-agnostic, offering zero-shot customization but limited to specialized domains. The main obstacle to extending zero-shot object customization from specific domains to the general domain is establishing a large-scale general ID dataset for model pre-training, which is time-consuming and labor-intensive. In this paper, we propose a novel pipeline to construct a large dataset of general objects and build the Multi-Category ID-Consistent (MC-IDC) dataset, featuring 315k text-image samples across 10k categories. With the help of MC-IDC, we introduce Customizing Anything (CustAny), a zero-shot framework that maintains ID fidelity and supports flexible text editing for general objects. CustAny features three key components: a general ID extraction module, a dual-level ID injection module, and an ID-aware decoupling module, allowing it to customize any object from a single reference image and text prompt. Experiments demonstrate that CustAny outperforms existing methods in both general object customization and specialized domains like human customization and virtual try-on. Our contributions include a large-scale dataset, the CustAny framework, and novel ID processing to advance this field. Code and dataset will be released soon at https://github.com/LingjieKong-fdu/CustAny.

Authors (11)
  1. Lingjie Kong
  2. Kai Wu
  3. Xiaobin Hu
  4. Wenhui Han
  5. Jinlong Peng
  6. Chengming Xu
  7. Donghao Luo
  8. Jiangning Zhang
  9. Chengjie Wang
  10. Yanwei Fu
  11. Mengtian Li