Multi-Level Neural Scene Graphs for Dynamic Urban Environments (2404.00168v1)
Abstract: We estimate the radiance field of large-scale dynamic areas from multiple vehicle captures under varying environmental conditions. Previous works in this domain are restricted to static environments, do not scale beyond a single short video, or struggle to represent dynamic object instances separately. To address these limitations, we present a novel, decomposable radiance field approach for dynamic urban environments. We propose a multi-level neural scene graph representation that scales to thousands of images from dozens of sequences with hundreds of fast-moving objects. To enable efficient training and rendering of our representation, we develop a fast composite ray sampling and rendering scheme. To test our approach in urban driving scenarios, we introduce a new benchmark for novel view synthesis. We show that our approach outperforms prior art by a significant margin on both established benchmarks and our proposed benchmark while being faster in training and rendering.