
DressCode: Autoregressively Sewing and Generating Garments from Text Guidance (2401.16465v4)

Published 29 Jan 2024 in cs.CV and cs.GR

Abstract: Apparel's significant role in human appearance underscores the importance of garment digitalization for digital human creation. Recent advances in 3D content creation are pivotal for digital human creation. Nonetheless, garment generation from text guidance is still nascent. We introduce a text-driven 3D garment generation framework, DressCode, which aims to democratize design for novices and offer immense potential in fashion design, virtual try-on, and digital human creation. We first introduce SewingGPT, a GPT-based architecture integrating cross-attention with text-conditioned embedding to generate sewing patterns with text guidance. We then tailor a pre-trained Stable Diffusion to generate tile-based Physically-based Rendering (PBR) textures for the garments. By leveraging an LLM, our framework generates CG-friendly garments through natural language interaction. It also facilitates pattern completion and texture editing, streamlining the design process through user-friendly interaction. This framework fosters innovation by allowing creators to freely experiment with designs and incorporate unique elements into their work. With comprehensive evaluations and comparisons with other state-of-the-art methods, our method showcases superior quality and alignment with input prompts. User studies further validate our high-quality rendering results, highlighting its practical utility and potential in production settings. Our project page is https://IHe-KaiI.github.io/DressCode/.

Authors (6)
  1. Kai He
  2. Kaixin Yao
  3. Qixuan Zhang
  4. Jingyi Yu
  5. Lingjie Liu
  6. Lan Xu
Citations (11)

Summary

An Examination of LaTeX Guidelines for Author Responses

This document sets out how to prepare author responses to paper reviews, with a specific focus on keeping rebuttals consistent and clear under the CVPR conference guidelines.

Key Elements of the Author Response Guidelines

The primary intention of the author response is to enable authors to address factual errors or provide additional clarification as requested by reviewers, without introducing new contributions. Key elements of these guidelines include:

  1. Response Constraints: Author responses are strictly limited to one page in PDF format. This constraint includes all content such as text, figures, and references. The objective is to ensure conciseness and focus on essential clarifications.
  2. Content Restrictions: Authors are instructed to refrain from introducing new experiments or contributions unless explicitly requested by reviewers. This maintains the integrity of the original submission and prevents an extension of the review process beyond the scope of the initial evaluation.
  3. Formatting Specifications: The document outlines precise formatting requirements:
    • A two-column layout with specified dimensions and margins.
    • Indentation and font specifications for consistency.
    • Unique numbering of equations, figures, and references in the author response, to avoid confusion.
  4. Anonymity Requirement: The guidelines emphasize that author anonymity must be preserved, prohibiting external links or identifiers that reveal author identity.

Implications for Academic Publishing

These guidelines have several implications for the academic publishing process:

  • Consistency and Fairness: By imposing a standard format and limiting content, the guidelines ensure that all authors and reviewers engage with responses uniformly, promoting fairness in evaluating rebuttals.
  • Efficiency in Review Processes: Specifying constraints optimizes the review process, allowing reviewers to efficiently assess responses without being burdened by extensive new data or experiments.
  • Quality Control: Strict adherence to formatting and content rules helps preserve the quality and professionalism of academic discourse.

Future Considerations

Looking forward, these guidelines could be adapted to advances in digital review processes. For instance, as collaboration tools mature, responses might integrate interactive features while remaining brief and clear. The balance between providing sufficient detail and adhering to page limits might also be reassessed as digital media evolve, potentially allowing for more dynamic review mechanisms. However, any adjustments must continue to prioritize the principles of fairness, clarity, and objectivity that underpin current practices.

In conclusion, this document serves as a critical resource for authors aiming to effectively communicate with reviewers during the paper evaluation process, emphasizing clarity, conciseness, and adherence to standards that facilitate an efficient and fair review system.
