
STAIR: Semantic-Targeted Active Implicit Reconstruction (2403.11233v1)

Published 17 Mar 2024 in cs.RO and cs.CV

Abstract: Many autonomous robotic applications require object-level understanding when deployed. Actively reconstructing objects of interest, i.e. objects with specific semantic meanings, is therefore relevant for a robot to perform downstream tasks in an initially unknown environment. In this work, we propose a novel framework for semantic-targeted active reconstruction using posed RGB-D measurements and 2D semantic labels as input. The key components of our framework are a semantic implicit neural representation and a compatible planning utility function based on semantic rendering and uncertainty estimation, enabling adaptive view planning to target objects of interest. Our planning approach achieves better reconstruction performance in terms of mesh and novel view rendering quality compared to implicit reconstruction baselines that do not consider semantics for view planning. Our framework further outperforms a state-of-the-art semantic-targeted active reconstruction pipeline based on explicit maps, justifying our choice of utilising implicit neural representations to tackle semantic-targeted active reconstruction problems.
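The core idea is a planning utility that couples rendered semantics with uncertainty so that view selection concentrates on objects of interest. As a rough, hedged illustration (not the paper's exact formulation), the sketch below scores candidate views by per-pixel uncertainty weighted by the rendered probability of the target class and picks the highest-scoring pose; the function names, the toy scoring rule, and the random stand-in renders are assumptions made for illustration only.

```python
import numpy as np

def view_utility(semantic_probs: np.ndarray,
                 uncertainty: np.ndarray,
                 target_class: int) -> float:
    """Illustrative semantic-targeted utility (assumed form, not the paper's):
    per-pixel uncertainty weighted by the rendered probability of the target class.

    semantic_probs: (H, W, C) softmax over C semantic classes, rendered from the
                    implicit representation at the candidate pose.
    uncertainty:    (H, W) per-pixel uncertainty estimate (e.g. rendered variance).
    target_class:   index of the semantic class of interest.
    """
    target_weight = semantic_probs[..., target_class]      # (H, W)
    return float(np.sum(target_weight * uncertainty))

def next_best_view(candidate_renders, target_class):
    """Pick the candidate pose whose render maximises the utility.
    candidate_renders: list of (pose, semantic_probs, uncertainty) tuples."""
    scores = [view_utility(p, u, target_class) for _, p, u in candidate_renders]
    best = int(np.argmax(scores))
    return candidate_renders[best][0], scores[best]

# Toy usage with random stand-in renders for 5 candidate poses (H=W=8, C=3 classes).
rng = np.random.default_rng(0)
candidates = []
for pose_id in range(5):
    logits = rng.normal(size=(8, 8, 3))
    probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    unc = rng.uniform(size=(8, 8))
    candidates.append((pose_id, probs, unc))

best_pose, best_score = next_best_view(candidates, target_class=1)
print(best_pose, round(best_score, 3))
```

In the actual pipeline, the semantic probabilities and uncertainties would come from volume rendering of the semantic implicit neural representation at each candidate pose rather than from random arrays; the scoring-and-argmax structure is what the utility-driven view planning amounts to.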

