Lifelong LERF: Local 3D Semantic Inventory Monitoring Using FogROS2 (2403.10494v1)
Abstract: Inventory monitoring in homes, factories, and retail stores relies on maintaining data despite objects being swapped, added, removed, or moved. We introduce Lifelong LERF, a method that allows a mobile robot with minimal compute to jointly optimize a dense language and geometric representation of its surroundings. Lifelong LERF maintains this representation over time by detecting semantic changes and selectively updating only the affected regions of the environment, avoiding the need to exhaustively remap. Human users can query the inventory in natural language and receive a 3D heatmap of potential object locations. To manage the computational load, we use FogROS2, a cloud robotics platform, to offload resource-intensive tasks. Lifelong LERF obtains poses from a monocular RGBD SLAM backend and uses these poses to progressively optimize a Language Embedded Radiance Field (LERF) for semantic monitoring. Experiments with 3-5 objects arranged on a tabletop and a Turtlebot with a RealSense camera suggest that Lifelong LERF can persistently adapt to changes in objects with up to 91% accuracy.
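The natural-language query step described in the abstract can be illustrated with a minimal sketch. It assumes that each 3D point already carries a language embedding and that the query has been embedded into the same space. The function name `relevancy_heatmap`, the toy 4-dimensional embeddings, and the plain cosine-similarity score are all illustrative assumptions, not the scoring used by Lifelong LERF.

```python
import numpy as np

def relevancy_heatmap(point_embeddings, query_embedding):
    """Score each 3D point by cosine similarity to the query, rescaled to [0, 1].

    Hypothetical helper for illustration only; not the authors' method.
    """
    # Normalize so the dot product equals cosine similarity.
    points = point_embeddings / np.linalg.norm(point_embeddings, axis=1, keepdims=True)
    query = query_embedding / np.linalg.norm(query_embedding)
    similarity = points @ query  # shape (N,), values in [-1, 1]
    return (similarity + 1.0) / 2.0

# Toy data: three 3D points with made-up 4-dimensional embeddings.
positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])
embeddings = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0, 0.0],
                       [0.7, 0.7, 0.0, 0.0]])
query = np.array([1.0, 0.0, 0.0, 0.0])

heat = relevancy_heatmap(embeddings, query)
best = positions[np.argmax(heat)]  # location of the highest-scoring point
```

In a real system the embeddings would come from a vision-language model and the heatmap would be rendered over the reconstructed scene; here the scores are simply attached to the toy point positions.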