
CNS-Edit: 3D Shape Editing via Coupled Neural Shape Optimization (2402.02313v1)

Published 4 Feb 2024 in cs.CV and cs.GR

Abstract: This paper introduces a new approach based on a coupled representation and a neural volume optimization to implicitly perform 3D shape editing in latent space. This work has three innovations. First, we design the coupled neural shape (CNS) representation for supporting 3D shape editing. This representation includes a latent code, which captures the high-level global semantics of the shape, and a 3D neural feature volume, which provides a spatial context to associate with the local shape changes given by the editing. Second, we formulate the coupled neural shape optimization procedure to co-optimize the two coupled components in the representation subject to the editing operation. Third, we offer various 3D shape editing operators, i.e., copy, resize, delete, and drag, and derive each into an objective for guiding the CNS optimization, such that we can iteratively co-optimize the latent code and neural feature volume to match the editing target. With our approach, we can achieve a rich variety of editing results that are aware of the shape semantics and difficult to achieve with existing approaches. Both quantitative and qualitative evaluations demonstrate the strong capabilities of our approach compared with state-of-the-art solutions.
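The co-optimization idea in the abstract can be illustrated with a deliberately tiny sketch. This is not the paper's implementation: the scalar `latent`, the one-dimensional `volume`, the linear `decode` function, the squared-error objective, and the regularization weight `reg` are all assumptions made here for illustration. It only shows the general pattern of updating a global code and a local feature grid together so that the decoded field matches a target produced by an edit (here, a "delete" of the last two cells).

```python
def decode(latent, volume, base):
    """Toy decoder (assumption): a global scale on a base shape plus local offsets."""
    return [latent * b + v for b, v in zip(base, volume)]


def edit_loss(field, target, volume, reg):
    """Squared error to the edit target plus a penalty keeping local offsets small."""
    data = sum((f - t) ** 2 for f, t in zip(field, target))
    return data + reg * sum(v * v for v in volume)


def co_optimize(base, target, steps=200, lr=0.05, reg=0.1):
    """Gradient descent on the latent and the volume at the same time."""
    latent = 1.0                      # global component
    volume = [0.0] * len(base)        # local component
    history = []
    for _ in range(steps):
        field = decode(latent, volume, base)
        history.append(edit_loss(field, target, volume, reg))
        residual = [f - t for f, t in zip(field, target)]
        # Hand-derived gradients of edit_loss for the linear toy decoder.
        grad_latent = sum(2 * r * b for r, b in zip(residual, base))
        grad_volume = [2 * r + 2 * reg * v for r, v in zip(residual, volume)]
        latent -= lr * grad_latent
        volume = [v - lr * g for v, g in zip(volume, grad_volume)]
    return latent, volume, history


# Hypothetical 1D "shape": six occupied cells; the edit deletes the last two.
base = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
target = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
latent, volume, history = co_optimize(base, target)
final_field = decode(latent, volume, base)
```

In this toy setting the loss in `history` decreases over the iterations, and the local offsets in `volume` absorb the change that a single global scale cannot express, which loosely mirrors the division of roles the abstract describes for the latent code and the feature volume.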
