Advanced Plant Diseases Dataset
- New plant disease datasets are rigorously curated collections of images and multimodal data that support detection, segmentation, and question-answering tasks with expert annotations.
- They offer extensive species and class diversity across both lab-controlled and in-the-wild settings, with detailed modalities such as segmentation masks, bounding boxes, and text prompts.
- These datasets underpin robust benchmarking of deep learning and vision-language models through standardized splits and evaluation metrics, advancing agricultural AI research.
A new plant disease dataset, in contemporary research, refers to any rigorously gathered, curated, and annotated collection of images or multimodal data (such as paired text descriptions) specifically designed to support computational methods for the detection, diagnosis, segmentation, or retrieval of plant diseases. Such datasets address critical bottlenecks in agricultural AI: they enable domain-adapted training, provide evaluation benchmarks for vision and vision-language models, and reflect the visual and taxonomic diversity of both lab-controlled and in-the-wild phenotypes. Recent advances include datasets that scale in breadth (species, pathogens, regions), annotation fidelity (pixel-level segmentation, bounding boxes), and modality (images, text, metadata, QA pairs), directly supporting both traditional deep learning and foundation-model-powered approaches.
1. Dataset Scope and Taxonomic Breadth
Recent plant disease datasets prioritize scale and diversity—from canonical lab-captured collections to comprehensive in-the-wild and multimodal archives.
- Image Scale and Diversity: Contemporary datasets range from thousands of images (e.g., PiW: 1,980 images (Nuthalapati et al., 2021); PlantDoc: 2,598 images (Singh et al., 2019)) up to 186,000 images (LeafNet (Quoc et al., 14 Feb 2026)) and 178,922 images (FloraSyntropy Archive (Khan et al., 25 Aug 2025)).
- Species and Class Coverage: The largest sets (FloraSyntropy, LeafNet) cover up to 35 species and 97 distinct disease/health classes, including both common crops (maize, rice, tomato, apple) and economically less-documented species. Standard datasets such as PlantVillage (Hughes et al., 2015) include 38 classes, with others capturing multi-label, co-infection, and complex pattern phenotypes (Thapa et al., 2020).
- Environmental Breadth: PlantWild (Wei et al., 2024) and PlantSeg (Wei et al., 2024) specifically address uncontrolled conditions, spanning lighting, occlusion, growth stage, and image quality, which expose models to real deployment scenarios.
- Modalities: Modern archives provide not only RGB images but also:
- Paired disease symptom descriptions (PlantWild (Wei et al., 2024), Snap and Diagnose (Wei et al., 2024))
- Segmentation masks (PlantSeg (Wei et al., 2024), LDD (Rossi et al., 2022))
- Object bounding boxes for lesion- or organ-level disease localization (Katumba et al., 2020)
- Structured metadata including species, disease agent, collection environment, and text-based question-answer pairs (LeafNet (Quoc et al., 14 Feb 2026))
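As a concrete illustration of how such a multimodal record might be organized, the sketch below defines one possible sample schema. All field names are hypothetical and not drawn from any cited dataset's actual specification.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical schema for a multimodal plant disease sample; field names
# are illustrative, not taken from any specific dataset's documentation.
@dataclass
class PlantDiseaseSample:
    image_path: str                    # path to the RGB image
    species: str                       # e.g. "tomato"
    disease_label: str                 # e.g. "early_blight" or "healthy"
    bounding_boxes: list = field(default_factory=list)  # [(x_min, y_min, x_max, y_max), ...]
    mask_path: Optional[str] = None    # pixel-level segmentation mask, if provided
    text_prompt: Optional[str] = None  # paired symptom description
    qa_pairs: list = field(default_factory=list)        # [(question, answer), ...]
    metadata: dict = field(default_factory=dict)        # collection environment, disease agent, etc.

sample = PlantDiseaseSample(
    image_path="images/tomato_0001.jpg",
    species="tomato",
    disease_label="early_blight",
    text_prompt="Dark concentric lesions on lower leaves.",
)
```

A loader for any particular archive would map its on-disk layout (CSV labels, JSON annotations, mask directories) onto a record like this.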
Representative Dataset Properties
| Dataset Name | Images | Species | Classes | Modality | In-the-wild | Segmentation | Multimodal |
|---|---|---|---|---|---|---|---|
| PlantVillage | 54,309 | 14 | 38 | RGB, CSV labels | No | No | No |
| LeafNet | 186,000 | 22 | 97 | RGB, QA, metadata | Yes | No | Yes |
| FloraSyntropy | 178,922 | 35 | 97 | RGB, metadata | Partly | No | No |
| PlantWild | 18,542 | 89 | 89 | RGB, text prompts | Yes | No | Yes |
| PlantSeg | 19,400 | 34 | 115 | RGB, segmentation masks | Yes | Yes | No |
| LDD | 1,092 | 1 | 10 | RGB, polygons/boxes | Yes | Yes | No |
2. Data Acquisition and Annotation Protocols
Acquisition strategies reflect the dataset’s intended domain adaptation and support for robust learning.
- Source Ecology: Images are collected:
- In-field: farm visits (e.g., Uganda, US, China), crowd-sourced field images, web scraping (PlantWild, PlantSeg, PlantDoc).
- Lab-controlled: detached leaves under standardized backgrounds (PlantVillage (Hughes et al., 2015), part of LeafNet (Quoc et al., 14 Feb 2026)).
- Image Preprocessing: Uniform resizing (224×224 or 400×400 px) and cleaning to standardize input for deep architectures (FloraSyntropy, PlantPath, LeafNet).
- Labeling Procedures:
- Expert Annotation: All datasets with pathology intent employ expert plant pathologists at labeling/QA steps (e.g., PlantPathology 2020 (Thapa et al., 2020), PlantWild (Wei et al., 2024), LDD (Rossi et al., 2022)).
- Segmentation/Instance Masking: Polygonal and pixel-wise masks generated via LabelMe or Label Studio, with dual pass annotation/review (PlantSeg, LDD).
- Object Detection: Lesion-level bounding boxes (e.g., passion fruit dataset (Katumba et al., 2020)), organ/cluster bounding for grape diseases (LDD), following minimum lesion size guidelines.
- Textual Description Generation: Text prompts and QA-pairs generated by expert curation and/or LLM prompting, with multi-phase validation (PlantWild, LeafNet).
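A minimal sketch of reducing polygon annotations to lesion-level bounding boxes, assuming the common LabelMe-style JSON layout (`shapes` entries carrying a `label` and a list of `points`); the disease label and coordinates below are invented for illustration.

```python
import json

def polygons_to_boxes(labelme_json: str):
    """Reduce LabelMe-style polygon annotations to per-lesion bounding boxes.

    Assumes the common LabelMe layout:
    {"shapes": [{"label": ..., "points": [[x, y], ...]}, ...]}.
    """
    ann = json.loads(labelme_json)
    boxes = []
    for shape in ann.get("shapes", []):
        xs = [p[0] for p in shape["points"]]
        ys = [p[1] for p in shape["points"]]
        boxes.append((shape["label"], (min(xs), min(ys), max(xs), max(ys))))
    return boxes

# Invented example polygon for one lesion.
example = json.dumps({
    "shapes": [
        {"label": "downy_mildew", "points": [[10, 12], [40, 15], [35, 48], [12, 44]]}
    ]
})
print(polygons_to_boxes(example))  # [('downy_mildew', (10, 12, 40, 48))]
```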
Annotation Workflow Example (PlantSeg (Wei et al., 2024))
- Training for polygon-standard annotation (qualification by expert pathologists)
- Annotation pass one (10 annotators)
- Expert review and correction
- Pathologist signoff
3. Structure, Splits, and Statistical Characteristics
Rigorous split and balancing strategies underpin reproducibility and fair evaluation.
- Typical Splits:
- Training/Validation/Test: PlantWild and FloraSyntropy use 70/10/20 splits; PlantPathology 2020 and LDD use 80/20 (Khan et al., 25 Aug 2025, Wei et al., 2024, Thapa et al., 2020, Rossi et al., 2022).
- Few-shot-specific splits (PiW, PlantWild) employ meta-train/meta-test class disjointness and episodic N-way M-shot sampling for support/query partitioning (Nuthalapati et al., 2021, Wei et al., 2024).
- Multi-class balancing is enforced by subsampling or over/under-sampling to minimum class thresholds (FloraSyntropy, PlantSeg).
- Statistical Class Metrics:
- Class imbalance ratios, per-class instance counts, and per-class mean area or mask properties are typically computed and reported (Rossi et al., 2022, Wei et al., 2024).
- Macro-averaged precision, recall, and F1-score are standard evaluation metrics for multi-class datasets (Khan et al., 25 Aug 2025, Quoc et al., 14 Feb 2026).
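The stratified split protocols above can be sketched as follows; this is a generic per-class 70/10/20 partition, not the exact procedure used by any cited dataset.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=0):
    """Per-class (stratified) train/val/test split over a list of class labels.

    Returns three lists of sample indices whose class proportions match
    the requested fractions as closely as integer counts allow.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n = len(indices)
        n_train = int(n * fractions[0])
        n_val = int(n * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to test
    return train, val, test

# Toy example: two balanced classes of 10 samples each.
labels = ["rust"] * 10 + ["blight"] * 10
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 14 2 4
```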
Example Class Distribution (FloraSyntropy (Khan et al., 25 Aug 2025))
| Split | Images per class | Description |
|---|---|---|
| Train | ≥4,712 | All 97 classes balanced post-augmentation |
| Valid | ≥524 | Stratified by class |
| Test | ≈35,784 total | Stratified 20% hold-out across all classes |
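Balancing classes up to a minimum threshold, as in the post-augmentation counts above, can be sketched as follows; this stand-in simply duplicates under-represented samples, whereas the datasets themselves use augmentation pipelines.

```python
import random
from collections import defaultdict

def oversample_to_threshold(samples, labels, min_per_class, seed=0):
    """Duplicate (with replacement) samples of under-represented classes
    until each class holds at least `min_per_class` entries.

    Returns a list of (sample, label) pairs. In practice the duplicates
    would be replaced by augmented image variants.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for s, l in zip(samples, labels):
        by_class[l].append(s)
    balanced = []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(max(0, min_per_class - len(items)))]
        balanced += [(s, label) for s in items + extra]
    return balanced

# Toy example: "rust" has 2 samples, "blight" has 3; both are raised to 4.
data = oversample_to_threshold(list(range(5)), ["rust"] * 2 + ["blight"] * 3, min_per_class=4)
print(len(data))  # 8
```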
4. Benchmarking, Model Architectures, and Evaluation
Dataset construction is tightly coupled to benchmarking state-of-the-art architectures for classification, segmentation, retrieval, or QA.
- Baseline Deep Networks: Models fine-tuned on these datasets include ResNet variants, EfficientDet (detection; (Katumba et al., 2020)), DenseNet (FloraSyntropy), VGG/Inception (PlantDoc), and Transformer backbones (SAN/ViT in PlantSeg).
- Foundation and Vision-Language Models:
- CLIP-style vision-language encoders underpin PlantWild retrieval and MVPDR classification (Wei et al., 2024, Wei et al., 2024).
- LeafBench evaluates both image-only and VLMs on QA (SCOLD, CLIP, GPT-4o, Gemini 2.5 Pro (Quoc et al., 14 Feb 2026)).
- Specialized Few-Shot Protocols: Episodic training and Mahalanobis-covariance-based metrics (for example, in PiW (Nuthalapati et al., 2021)) enable low-data regime benchmarking.
- Evaluation Metrics:
- Classification: accuracy, macro-F1, and per-class confusion analysis.
- Segmentation: mean intersection-over-union (MIoU), mAcc, per-class dice coefficient (Wei et al., 2024, Rossi et al., 2022).
- Detection: mean average precision (mAP) at IoU thresholds (Rossi et al., 2022).
- Retrieval: Top-K accuracy (P@K), mAP (Wei et al., 2024).
- QA: task-specific accuracy on closed-form multiple-choice LeafBench prompts (Quoc et al., 14 Feb 2026).
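As one worked example of the segmentation metrics above, a minimal mean intersection-over-union (MIoU) computation over flattened label maps; real evaluation code operates on full-resolution masks, but the arithmetic is the same.

```python
def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union across classes for flat label sequences.

    Per class: IoU = |pred==c AND gt==c| / |pred==c OR gt==c|.
    Classes absent from both prediction and ground truth are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Toy 2-class example on a flattened 2x4 "mask":
# class 0 IoU = 2/3, class 1 IoU = 5/6, mean = 0.75.
pred = [0, 0, 1, 1, 0, 1, 1, 1]
gt   = [0, 0, 1, 1, 1, 1, 1, 1]
print(round(mean_iou(pred, gt, 2), 3))  # 0.75
```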
Benchmark Performance (Selected Key Results)
| Dataset | Task | Model | Acc / mAP / MIoU | Reference |
|---|---|---|---|---|
| PlantPath2020 | Classification | ResNet50 | 97 % acc | (Thapa et al., 2020) |
| LeafNet | Healthy-diseased | CLIP | 97.5 % acc | (Quoc et al., 14 Feb 2026) |
| FloraSyntropy | Classification | FloraSyntropy-Net | 96.38 % acc | (Khan et al., 25 Aug 2025) |
| PlantSeg | Segmentation | SegNeXt (MSCAN-L) | 44.52 % MIoU, 59.95 % mAcc | (Wei et al., 2024) |
| LDD | Inst. Segm. | R³-CNN (Box/Mask AP) | 22.7 / 22.2 | (Rossi et al., 2022) |
| PlantWild | Retrieval | Snap’n Diagnose | 67.32 (Top-1), 79.34 mAP | (Wei et al., 2024) |
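The episodic N-way M-shot sampling used in the few-shot protocols above can be sketched generically as follows; class names and episode sizes are invented for illustration.

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way, m_shot, q_queries, seed=None):
    """Draw one N-way M-shot episode: pick N classes, then M support and
    Q query indices per class, with support and query kept disjoint."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # Only classes with enough samples for both support and query qualify.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= m_shot + q_queries]
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        chosen = rng.sample(by_class[c], m_shot + q_queries)
        support += chosen[:m_shot]
        query += chosen[m_shot:]
    return support, query

# Toy example: a 2-way 3-shot episode with 2 queries per class.
labels = ["rust"] * 6 + ["blight"] * 6 + ["mosaic"] * 6
support, query = sample_episode(labels, n_way=2, m_shot=3, q_queries=2, seed=1)
print(len(support), len(query))  # 6 4
```

Meta-train/meta-test class disjointness is enforced one level up, by partitioning the class set before any episodes are drawn.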
5. Use Cases, Limitations, and Future Directions
New datasets directly enable both core research and applied tools in plant pathology, but key gaps and research challenges remain.
- Use Cases:
- Large-scale benchmarks for classical and deep learning architectures
- Evaluation and deployment of vision-language models and few-shot/zero-shot classifiers
- Precision agriculture: segmentation for disease severity estimation, decision support for fungicide/pesticide applications, real-time smartphone or drone-based scouting
- Domain adaptation studies, cross-dataset benchmarking, and transfer learning analyses (Thapa et al., 2020, Wei et al., 2024, Khan et al., 25 Aug 2025)
- Limitations and Open Issues:
- Geographic and phenological diversity is still limited in many archives (LeafNet: 7 countries, but global expansion is needed) (Quoc et al., 14 Feb 2026).
- Most sets lack temporal progression sequences, multi-label co-infection annotation, and detailed severity gradation (a notable exception is the 3-point severity scale in PiW (Nuthalapati et al., 2021)).
- Absence of metadata such as growth stage, GPS coordinates, or multi-spectral modalities in most datasets.
- Licensing and access conditions still vary; not all are fully open-access at publication (Katumba et al., 2020, Khan et al., 25 Aug 2025, Wei et al., 2024).
- Proposed Directions:
- Enrichment with temporal and contextual metadata, structured severity scoring, and multi-label co-infection annotation
- Synthetic augmentation for rare disease instantiation and expansion to multispectral/temporal datasets
- Versioning and community-led label-quality improvement and challenge-leaderboards
- Development of QA and visual reasoning benchmarks such as LeafBench to bridge the gap to robust, trustworthy diagnostic tools (Quoc et al., 14 Feb 2026)
6. Comparative Analysis and Significance for the Field
The evolution of plant disease datasets from single-crop, lab-controlled images to complex, multimodal, and in-the-wild benchmarks has shifted standards for designing, evaluating, and deploying agricultural AI.
- Datasets such as PlantWild, PlantSeg, and LeafNet now support (1) training and testing of domain-adapted deep nets, (2) rigorous benchmarking of few-shot and vision-language methods, and (3) critical evaluation of real-world performance gaps (e.g., fine-grained disease classification <65 % even with large VLMs (Quoc et al., 14 Feb 2026)).
- A plausible implication is that robust, generalizable, and deployment-ready plant disease detection will increasingly depend on both expanded dataset scale and annotation depth, including explicit multi-modal and open-set QA contexts.
- Ongoing integration of expert-driven curation, open-source licensing, and community challenge-based model development will likely accelerate translation of these benchmarks into real-world, farmer-oriented diagnostic applications and decision-support frameworks.
The emerging generation of plant diseases datasets thus forms the substrate for methodological progress in both core machine vision and agricultural AI, addressing the challenges of generalization, robustness, and multimodal understanding required for practical, scalable crop health management.