DAVIS-Ag: A Synthetic Plant Dataset for Prototyping Domain-Inspired Active Vision in Agricultural Robots (2303.05764v3)
Abstract: In agricultural environments, viewpoint planning can be a critical capability for a robot with visual sensors, enabling it to obtain informative observations of objects of interest (e.g., fruits) within complex plant structures with random occlusions. Although recent studies on active vision have shown potential for agricultural tasks, each model has been designed and validated in a unique environment that cannot easily be replicated for benchmarking novel methods developed later. In this paper, we introduce a dataset called DAVIS-Ag to promote more extensive research on Domain-inspired Active VISion in Agriculture. Specifically, we leveraged our open-source "AgML" framework and the "Helios" 3D plant simulator to produce 502K RGB images from 30K densely sampled spatial locations in 632 synthetic orchards. Plant environments of strawberries, tomatoes, and grapes are considered at two scales (i.e., Single-Plant and Multi-Plant). Useful labels are also provided for each image, including (1) bounding boxes and (2) instance segmentation masks for all identifiable fruits, as well as (3) pointers to the images of viewpoints that are reachable by executing an action, so as to simulate active viewpoint selection at each time step. Using DAVIS-Ag, we visualize motivating examples in which fruit visibility changes dramatically with the pose of the camera view, primarily due to occlusions by other plant components such as leaves. Furthermore, we present several baseline models with experimental results for benchmarking on the task of target visibility maximization. Transferability to real strawberry environments is also investigated to demonstrate the feasibility of using the dataset for prototyping real-world solutions. For future research, our dataset is publicly available online: https://github.com/ctyeong/DAVIS-Ag.
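To make the viewpoint-pointer mechanism concrete, the following is a minimal Python sketch of how such a dataset could be traversed for the target visibility maximization task: each viewpoint record carries its image, its fruit annotations, and per-action pointers to neighboring viewpoints, and a simple greedy policy follows the action whose destination shows the most fruits. The JSON layout, field names (`image`, `fruits`, `neighbors`), and file paths below are hypothetical illustrations, not DAVIS-Ag's actual schema; consult the linked repository for the real format.

```python
# Sketch of active viewpoint selection on DAVIS-Ag-style data.
# NOTE: the field names ("image", "fruits", "neighbors") and the metadata
# layout are hypothetical; see https://github.com/ctyeong/DAVIS-Ag for the
# dataset's actual structure.
import json
from pathlib import Path


def load_viewpoints(meta_path: Path) -> dict:
    """Load a mapping from viewpoint id to its record, assumed to look like
    {"image": <file>, "fruits": [<bounding boxes>], "neighbors": {action: id}}."""
    with meta_path.open() as f:
        return json.load(f)


def greedy_visibility_rollout(viewpoints: dict, start_id: str, steps: int) -> list:
    """At each time step, take the action whose destination viewpoint shows
    the most identifiable fruits (a naive baseline for target visibility
    maximization)."""
    trajectory = [start_id]
    current = start_id
    for _ in range(steps):
        neighbors = viewpoints[current].get("neighbors", {})
        if not neighbors:
            break  # no viewpoint is reachable from here
        # Score each candidate viewpoint by its number of visible fruits.
        current = max(neighbors.values(),
                      key=lambda vid: len(viewpoints[vid]["fruits"]))
        trajectory.append(current)
    return trajectory


if __name__ == "__main__":
    vps = load_viewpoints(Path("strawberry_single_plant/metadata.json"))
    path = greedy_visibility_rollout(vps, start_id="vp_000", steps=5)
    print("Selected viewpoints:", path)
```

A learned policy (e.g., the reinforcement-learning baselines evaluated in the paper) would replace the greedy scoring step with a trained action selector, but the pointer-following loop would be the same.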