
NEAT: Distilling 3D Wireframes from Neural Attraction Fields (2307.10206v2)

Published 14 Jul 2023 in cs.CV and cs.GR

Abstract: This paper studies the problem of structured 3D reconstruction using wireframes that consist of line segments and junctions, focusing on the computation of structured boundary geometries of scenes. Instead of leveraging matching-based solutions from 2D wireframes (or line segments) for 3D wireframe reconstruction, as done in prior art, we present NEAT, a rendering-distilling formulation that uses neural fields to represent 3D line segments with 2D observations, and bipartite matching to perceive and distill a sparse set of 3D global junctions. NEAT jointly optimizes the neural fields and the global junctions from scratch, using view-dependent 2D observations without precomputed cross-view feature matching. Comprehensive experiments on the DTU and BlendedMVS datasets demonstrate NEAT's superiority over state-of-the-art alternatives for 3D wireframe reconstruction. Moreover, the 3D global junctions distilled by NEAT are a better initialization than SfM points for the recently emerged 3D Gaussian Splatting for high-fidelity novel view synthesis, using about 20 times fewer initial 3D points. Project page: https://xuenan.net/neat

