Toward Semantic Scene Understanding for Fine-Grained 3D Modeling of Plants (2312.17110v1)
Abstract: Agricultural robotics is an active research area due to global population growth and expectations of food and labor shortages. Robots can potentially help with tasks such as pruning, harvesting, phenotyping, and plant modeling. However, agricultural automation is hampered by the difficulty in creating high-resolution 3D semantic maps in the field that would allow for safe manipulation and navigation. In this paper, we build toward solutions for this issue and showcase how the use of semantics and environmental priors can help in constructing accurate 3D maps for the target application of sorghum. Specifically, we 1) use sorghum seeds as semantic landmarks to build a visual Simultaneous Localization and Mapping (SLAM) system that enables us to map 78% of a sorghum range on average, compared to 38% with ORB-SLAM2; and 2) use seeds as semantic features to improve 3D reconstruction of a full sorghum panicle from images taken by a robotic in-hand camera.