TomatoDIFF: On-plant Tomato Segmentation with Denoising Diffusion Models (2307.01064v1)
Abstract: Artificial intelligence applications enable farmers to optimize crop growth and production while reducing costs and environmental impact. Computer vision-based algorithms, in particular, are commonly used for fruit segmentation, enabling in-depth analysis of harvest quality and accurate yield estimation. In this paper, we propose TomatoDIFF, a novel diffusion-based model for semantic segmentation of on-plant tomatoes. When evaluated against other competitive methods, our model demonstrates state-of-the-art (SOTA) performance, even in challenging environments with highly occluded fruits. Additionally, we introduce Tomatopia, a new, large, and challenging dataset of greenhouse tomatoes. The dataset comprises high-resolution RGB-D images and pixel-level annotations of the fruits.