- The paper introduces a dual-branch pipeline using a pre-trained Stable Diffusion model and sparse correspondence alignment for precise garment manipulation.
- It employs point-based control mechanisms, including condition dropping and point-weighted loss, to enhance customization in complex virtual try-on scenarios.
- It outperforms state-of-the-art methods in rendering fidelity and manipulability, paving the way for advanced research in dynamic virtual try-on systems.
Manipulable Virtual Try-on Enhanced by Sparse Correspondence Alignment
Introduction
The field of virtual try-on has steadily gained traction, driven by its utility in the fashion industry and its role in enhancing online shopping experiences. Traditional methods relied heavily on generative adversarial networks (GANs) but often fell short in complex scenarios or when fine-grained control over garment presentation was desired. Advances in diffusion models brought a leap in generation quality, yet they still lack the nuanced control over wearing style that a truly customizable try-on experience requires. The framework "Wear-Any-Way" addresses these gaps: it not only achieves high-fidelity garment rendering in complex settings but also introduces a new interaction mode that lets users manipulate garment wearing styles through simple, intuitive controls.
Methodology Overview
The foundation of Wear-Any-Way is a dual-branch pipeline that leverages a pre-trained inpainting Stable Diffusion model to generate the try-on results. It incorporates point-based control through sparse correspondence alignment, enabling precise manipulation of the garment's position and style on the model. The system handles challenging settings, including model-to-model try-on and multi-garment scenarios in complex real-world scenes.
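The paper's architecture goes well beyond this summary, but the central idea of a dual-branch design, where a reference branch extracts garment features that the main denoising branch attends to, can be illustrated with a toy single-head cross-attention step. This is a minimal NumPy sketch under stated assumptions; the function name, token shapes, and single-head form are illustrative, not the paper's exact implementation:

```python
import numpy as np

def cross_attention(main_tokens, ref_tokens):
    """Toy cross-attention: main-branch tokens attend to reference-branch
    (garment) tokens, the mechanism by which garment appearance is
    injected into the generation branch.

    main_tokens: (N, C) spatial tokens from the main U-Net layer.
    ref_tokens:  (M, C) spatial tokens from the reference U-Net layer.
    Returns an (N, C) update where each main token is a softmax-weighted
    mix of reference tokens.
    """
    scale = np.sqrt(main_tokens.shape[1])
    scores = main_tokens @ ref_tokens.T / scale        # (N, M) similarities
    scores -= scores.max(axis=1, keepdims=True)        # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)            # rows sum to 1
    return attn @ ref_tokens                           # (N, C) garment-conditioned features
```

In the real system the queries, keys, and values would each pass through learned projections inside the U-Net's attention layers; this sketch only shows the information flow from the reference branch into the main branch.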
Sparse Correspondence Alignment
The core innovation lies in the sparse correspondence alignment mechanism that facilitates point-based control, guiding the garment to a specific position on the person image. This mechanism learns a set of point embeddings that are injected into both the main and reference U-Nets. Combined with strategies such as condition dropping, zero initialization, and a point-weighted loss, this design strengthens the model's learning and control precision.
Training Points Collection
A notable challenge addressed is the collection of accurately matched training point pairs between garment and person images. Leveraging the inherent semantic-matching capability of pre-trained diffusion models, the authors establish a robust pipeline to collect paired points efficiently. These pairs are essential for training the model to understand and enact point-based controls accurately.
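The core matching step in such pipelines is typically nearest-neighbour search in a shared feature space: for a point on the garment image, find the location in the person image whose diffusion feature is most similar. Here is a minimal NumPy sketch of that step under stated assumptions; `match_points`, the feature shapes, and cosine similarity as the metric are illustrative choices, not necessarily the paper's exact procedure:

```python
import numpy as np

def match_points(feat_garment, feat_person, query_points):
    """For each (y, x) query point on the garment feature map, return the
    (y, x) location in the person feature map with the highest cosine
    similarity. feat_*: (H, W, C) feature maps, e.g. extracted from an
    intermediate layer of a pre-trained diffusion U-Net."""
    H, W, C = feat_person.shape
    person = feat_person.reshape(-1, C)
    person = person / (np.linalg.norm(person, axis=1, keepdims=True) + 1e-8)
    matches = []
    for (y, x) in query_points:
        q = feat_garment[y, x]
        q = q / (np.linalg.norm(q) + 1e-8)
        idx = int(np.argmax(person @ q))   # best cosine match over all locations
        matches.append((idx // W, idx % W))
    return matches
```

In practice such raw matches would be filtered, e.g. by a similarity threshold or a cycle-consistency check, before being used as supervision for the point embeddings.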
Evaluation and Comparisons
Wear-Any-Way has been benchmarked against state-of-the-art virtual try-on methods and interactive image editing techniques. It demonstrated superior performance in both standard virtual try-on settings and controllability metrics, underscoring its detail fidelity and manipulation flexibility.
Implications and Future Directions
Wear-Any-Way stands as a significant advancement in the virtual try-on arena, bringing forth highly detailed and customizable garment rendering. Its introduction of manipulable wearing styles opens a plethora of possibilities for the fashion industry, especially in e-commerce, where user interaction and satisfaction are paramount.
The framework also sets a solid foundation for future explorations in this domain. Potential research directions could involve enhancing control granularity, extending manipulation capabilities to cover a wider array of garment types, or integrating more complex real-world scenarios. Additionally, addressing current limitations like artifact generation around fine details could further refine its utility and application scope.
In conclusion, Wear-Any-Way represents a milestone in virtual try-on technology, significantly bridging the gap between static garment presentation and dynamic, user-defined stylization. Its development not only elevates the shopping experience but also inspires continued innovation in the intersection of AI and fashion technology.