PlanarNeRF: Online Learning of Planar Primitives with Neural Radiance Fields
Abstract: Identifying spatially complete planar primitives from visual data is a crucial task in computer vision. Prior methods are largely restricted to either recovering 2D segments or simplifying 3D structures, even with extensive plane annotations. We present PlanarNeRF, a novel framework capable of detecting dense 3D planes through online learning. Drawing upon the neural field representation, PlanarNeRF makes three major contributions. First, it enhances 3D plane detection with concurrent appearance and geometry knowledge. Second, it proposes a lightweight plane fitting module to estimate plane parameters. Third, it introduces a novel global memory bank structure with an update mechanism, ensuring consistent cross-frame correspondence. The flexible architecture of PlanarNeRF allows it to function in both 2D-supervised and self-supervised settings, in each of which it can effectively learn from sparse training signals, significantly improving training efficiency. Through extensive experiments, we demonstrate the effectiveness of PlanarNeRF in various scenarios and its remarkable improvement over existing works.
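The abstract mentions a plane fitting module that estimates plane parameters but does not say how it works. As a rough illustration of what estimating plane parameters from 3D points can look like, here is a minimal generic RANSAC sketch in plain Python. It is an assumption for illustration only, not the authors' module: the function names (`plane_from_points`, `ransac_plane`), the `(normal, offset)` parameterization with n·x + d = 0, and the iteration count and threshold are all hypothetical choices.

```python
import random


def plane_from_points(p, q, r):
    """Plane through three points as (unit normal n, offset d) with n.x + d = 0.

    Returns None when the points are (nearly) collinear.
    """
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # The cross product of two in-plane edge vectors gives the plane normal.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def point_plane_distance(pt, n, d):
    """Unsigned distance from a point to the plane n.x + d = 0 (n is unit length)."""
    return abs(n[0] * pt[0] + n[1] * pt[1] + n[2] * pt[2] + d)


def ransac_plane(points, iters=200, threshold=0.01, seed=0):
    """Return ((n, d), inlier_indices) for the candidate plane with the most inliers."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_plane, best_inliers = None, []
    for _ in range(iters):
        p, q, r = rng.sample(points, 3)
        plane = plane_from_points(p, q, r)
        if plane is None:
            continue  # degenerate sample, try again
        n, d = plane
        inliers = [i for i, pt in enumerate(points)
                   if point_plane_distance(pt, n, d) < threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Hypothetical usage: a 5x5 grid of points on the plane z = 1, plus three outliers.
grid = [(float(x), float(y), 1.0) for x in range(5) for y in range(5)]
outliers = [(0.0, 0.0, 5.0), (1.0, 2.0, -3.0), (3.0, 1.0, 7.0)]
plane, inlier_ids = ransac_plane(grid + outliers)
```

A consensus-based fit like this tolerates outliers and needs only a few points per hypothesis. The sign of the recovered normal is arbitrary, so downstream comparisons between planes would need to account for that.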