X-Ray: A Sequential 3D Representation For Generation

Published 22 Apr 2024 in cs.CV | (2404.14329v2)

Abstract: We introduce X-Ray, a novel 3D sequential representation inspired by the penetrability of x-ray scans. X-Ray transforms a 3D object into a series of surface frames at different layers, making it suitable for generating 3D models from images. Our method casts rays from the camera center to capture geometric and texture details, including depth, normal, and color, across all intersected surfaces. This process efficiently condenses the whole 3D object into a multi-frame video format, motivating the use of a network architecture similar to those in video diffusion models. Because it stores only surface information, the representation is efficient. We also propose a two-stage pipeline, consisting of an X-Ray Diffusion Model and an Upsampler, to generate 3D objects. We demonstrate the practicality and adaptability of the X-Ray representation by synthesizing the complete visible and hidden surfaces of a 3D object from a single input image. Experimental results show that our representation achieves state-of-the-art accuracy in 3D generation, paving the way for new 3D representation research and practical applications.
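
The layered ray casting described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the scene of analytic spheres, the pinhole camera at the origin looking along +z, the function names `ray_sphere_hits` and `cast_xray`, and the use of `None` for rays that miss are all assumptions made for illustration. Each ray records every surface it crosses, sorted by depth, and the k-th crossing is written into the k-th frame, so hidden surfaces end up in later frames.

```python
import math

def ray_sphere_hits(origin, direction, center, radius):
    """Return every distance t > 0 at which a unit-length ray crosses the sphere."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * c  # quadratic coefficient a is 1 for a unit direction
    if disc <= 0.0:
        return []  # miss (grazing rays are treated as misses)
    root = math.sqrt(disc)
    return [t for t in ((-b - root) / 2.0, (-b + root) / 2.0) if t > 1e-9]

def cast_xray(spheres, size=8, num_layers=4, fov_deg=60.0):
    """Build num_layers frames of size x size pixels.

    Each pixel is None (no surface at that layer) or a dict holding the
    depth, surface normal, and color of the layer-th intersection.
    """
    origin = (0.0, 0.0, 0.0)  # hypothetical camera center
    half = math.tan(math.radians(fov_deg) / 2.0)
    frames = [[[None] * size for _ in range(size)] for _ in range(num_layers)]
    for row in range(size):
        for col in range(size):
            # Ray through the pixel center on an image plane at z = 1.
            x = (2.0 * (col + 0.5) / size - 1.0) * half
            y = (1.0 - 2.0 * (row + 0.5) / size) * half
            norm = math.sqrt(x * x + y * y + 1.0)
            d = (x / norm, y / norm, 1.0 / norm)
            hits = []
            for center, radius, color in spheres:
                for t in ray_sphere_hits(origin, d, center, radius):
                    p = [o + t * di for o, di in zip(origin, d)]
                    n = tuple((pi - ci) / radius for pi, ci in zip(p, center))
                    hits.append((t, n, color))
            hits.sort(key=lambda h: h[0])  # nearest surface first
            for layer, (t, n, color) in enumerate(hits[:num_layers]):
                frames[layer][row][col] = {"depth": t, "normal": n, "color": color}
    return frames

# One red sphere in front of the camera: frame 0 holds its visible front
# surface, frame 1 its hidden back surface, and later frames stay empty.
spheres = [((0.0, 0.0, 3.0), 1.0, (255, 0, 0))]
frames = cast_xray(spheres)
```

Stacking the frames in layer order gives the video-like sequence the abstract refers to. A mesh-based version would replace the analytic sphere test with ray-triangle intersection.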
