Daylily-Leaf Dataset for Foliar Disease Detection
- Daylily-Leaf dataset is a lesion-level annotated image collection designed for foliar disease detection with dual acquisition protocols (laboratory and in-field).
- Annotations follow a tripartite schema (Rust, Others, MidLate) set by plant-pathology experts to ensure precise bounding-box generation and quality control.
- Established benchmarking protocols using metrics like mAP@50 facilitate robust evaluation and comparison of detection models such as YOLO and TCLeaf-Net.
The Daylily-Leaf dataset is a lesion-level annotated image dataset curated for foliar disease detection in daylily plants, targeting fine-grained, object-level model evaluation under both laboratory-controlled ("ideal") and real-world ("in-field") acquisition scenarios. Its design, annotation rigor, and clearly defined train/val splits make it well suited to benchmarking and developing robust plant disease detectors, especially in contexts characterized by cluttered backgrounds, occlusions, and domain shift (Song et al., 13 Dec 2025).
1. Dataset Structure and Acquisition Protocol
Daylily-Leaf comprises 1,746 RGB images with 7,839 meticulously annotated lesions, distributed between two principal subsets:
- Ideal subset: 813 images, 5,172 lesions; laboratory-acquired against white backgrounds.
- In-field subset: 933 images, 2,667 lesions; captured under natural lighting on daylily cultivation plots, with complex backgrounds and varying scales.
Images were sourced at ~12-megapixel resolution (≥4000×3000 px), then partitioned into overlapping crops (typically 640×640 px) to maintain a balanced lesion density per image for annotation and model training. All processed images were stored as JPEGs and subsequently resized to 640×640 px for network input.
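The overlapping-crop step can be sketched as follows. This is a minimal illustration: the 128 px overlap and the edge-covering policy are assumptions, since the exact crop stride is not stated here.

```python
# Sketch of partitioning a high-resolution capture into overlapping
# 640x640 crops. The 128 px overlap is an illustrative assumption.
def tile_coords(width, height, tile=640, overlap=128):
    """Return (x, y) top-left corners of overlapping tile x tile crops."""
    stride = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, stride))
    ys = list(range(0, max(height - tile, 0) + 1, stride))
    # Ensure the right and bottom edges are covered by a final crop.
    if xs[-1] != max(width - tile, 0):
        xs.append(max(width - tile, 0))
    if ys[-1] != max(height - tile, 0):
        ys.append(max(height - tile, 0))
    return [(x, y) for y in ys for x in xs]

# A ~12 MP source image (4000x3000 px) yields a grid of 640x640 crops.
coords = tile_coords(4000, 3000)
```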
2. Annotation Schema, Categories, and Workflow
Annotations were created using LabelImg by plant-pathology domain experts:
- Bounding-box generation: Tight axis-aligned rectangles around all visible lesions.
- Quality control: Secondary annotator review for case ambiguity.
Annotations were formatted in PASCAL VOC XML (1 file/image), including the following fields:
- image_id or filename
- object {name: ["Rust", "Others", "MidLate"], xmin, ymin, xmax, ymax}
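Concretely, a single annotation file with these fields might look as follows. This is a hypothetical example: the box values mirror the CSV sample in Section 6, and the size block is illustrative.

```xml
<annotation>
  <filename>daylily_0123.jpg</filename>
  <size><width>640</width><height>640</height><depth>3</depth></size>
  <object>
    <name>Rust</name>
    <bndbox>
      <xmin>123</xmin><ymin>45</ymin><xmax>200</xmax><ymax>180</ymax>
    </bndbox>
  </object>
</annotation>
```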
Three disease categories structure the object taxonomy:
- Rust ("Rust"): Early rust pustules, spot-like lesions.
- Others ("Others"): Miscellaneous small spots, minor necroses, insect feeding.
- Mid–Late ("MidLate"): Powdery mildew and mid-to-late-stage coalesced disease spots.
This tripartite schema balances biological meaningfulness with the class imbalance inherent to foliar lesion occurrences.
3. Organization, Splits, and Density
The directory and split structure is explicit and reproducible:
```
dataset_root/
├── ideal/
│   ├── train/
│   │   ├── images/
│   │   └── annotations/
│   └── val/
│       ├── images/
│       └── annotations/
└── infield/
    ├── train/images/, annotations/
    └── val/images/, annotations/
```
Each subset (ideal/in-field) is split approximately 70/30 by count into train and val (validation). Images are strictly partitioned; no overlap across splits. Class stratification maintains Rust:Others:MidLate approximate ratios in all splits. Lesion density varies: ideal (mean ≈6.36 lesions/image), in-field (mean ≈2.86 lesions/image), overall mean ≈4.49 lesions/image.
| Subset | Split | #Images | #Lesions | Rust | Others | Mid–Late |
|---|---|---|---|---|---|---|
| Ideal | Train | 569 | 3,877 | 2,229 | 1,228 | 420 |
| Ideal | Val | 244 | 1,295 | 791 | 375 | 129 |
| In-Field | Train | 653 | 1,788 | 1,169 | 552 | 67 |
| In-Field | Val | 280 | 879 | 469 | 374 | 36 |
| Total | | 1,746 | 7,839 | 4,658 | 2,529 | 652 |
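The per-image lesion densities quoted earlier follow directly from the subset totals in the table, as a quick cross-check shows:

```python
# Cross-check of the quoted lesion densities from the subset totals.
ideal_images, ideal_lesions = 813, 5172
infield_images, infield_lesions = 933, 2667

ideal_density = ideal_lesions / ideal_images        # quoted as ~6.36
infield_density = infield_lesions / infield_images  # quoted as ~2.86
overall_density = (ideal_lesions + infield_lesions) / (ideal_images + infield_images)
```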
4. Statistical Properties and Computation
For dataset quantification, the lesion-count mean ($\mu$) and variance ($\sigma^2$) over a split are defined as

$$\mu = \frac{1}{N}\sum_{i=1}^{N} c_i, \qquad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (c_i - \mu)^2$$

where $N$ is the image count in the split and $c_i$ is the lesion count of image $i$. Researchers can compute these using the Python snippet provided in Section 6, utilizing XML parsing and numpy statistical functions.
Class distribution for total lesion objects:

| Class | Ideal | In-Field | Total |
|----------|:-----:|:--------:|:-----:|
| Rust | 3,020 | 1,638 | 4,658 |
| Others | 1,603 | 926 | 2,529 |
| Mid–Late | 549 | 103 | 652 |
5. Evaluation Metrics and Model-Building Protocol
Recommended evaluation protocol utilizes standard object-detection measures:
- Precision, Recall, F1: $\mathrm{Precision} = \frac{TP}{TP+FP}$, $\mathrm{Recall} = \frac{TP}{TP+FN}$, $F_1 = \frac{2\,\mathrm{Precision}\cdot\mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}}$
- Average Precision (AP) at IoU threshold $\tau$: area under the precision–recall curve, $AP_\tau = \int_0^1 p(r)\,dr$
- mean AP@50 ($\mathrm{mAP@50}$): $\frac{1}{C}\sum_{c=1}^{C} AP_c$ at $\tau = 0.50$, averaged over the $C$ classes
- mean AP@[50:95]: average of $\mathrm{mAP}_\tau$ for $\tau \in \{0.50, 0.55, \ldots, 0.95\}$
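The TP/FP/FN counts behind these measures come from IoU-based matching of predictions to ground truth. A minimal sketch, assuming greedy one-to-one matching at the given IoU threshold (function names are illustrative, not from the paper):

```python
# IoU and greedy matching sketch for Precision/Recall at a given IoU
# threshold. Boxes are (xmin, ymin, xmax, ymax) tuples.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def precision_recall(preds, gts, thr=0.5):
    """Greedy one-to-one matching of predictions (confidence-sorted) to GT."""
    matched, tp = set(), 0
    for p in preds:
        best, best_iou = None, thr
        for i, g in enumerate(gts):
            if i not in matched and iou(p, g) >= best_iou:
                best, best_iou = i, iou(p, g)
        if best is not None:
            matched.add(best)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    return precision, recall
```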
Best-practice for training includes:
- Input size 640×640 px, RGB normalization ([0,1] or ImageNet mean/std).
- Augmentations: flips, rotations, brightness/contrast/HSV jitter, simulated rain/snow, mosaic mixing (up to 4 images).
- SGD optimizer: lr=0.001, momentum=0.937, weight decay=5e-4.
- ~200 epochs, batch size 16, single GPU.
- Non-Maximum Suppression (NMS) at inference, with an IoU threshold of 0.45.
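The NMS step above can be sketched as a greedy suppression loop at the stated IoU threshold of 0.45; a minimal illustration, not a specific framework's implementation:

```python
# Greedy NMS sketch: keep the highest-scoring box, drop overlaps >= iou_thr.
def box_iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def nms(boxes, scores, iou_thr=0.45):
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if box_iou(boxes[i], boxes[j]) < iou_thr]
    return keep
```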
Split usage is defined: "train" for fitting, "val" for hyperparameter selection; further subsampling from "val" for held-out test is permissible.
6. Sample Annotation Formats and Data Loading
CSV annotation rows (exported from XML):

```
image_id, xmin, ymin, xmax, ymax, class
daylily_0123.jpg, 123, 45, 200, 180, Rust
daylily_0123.jpg, 340, 90, 380, 140, Others
```
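Rows in this format can be grouped into per-image box lists using only the standard library; a minimal sketch, inlining the sample rows as a string:

```python
import csv
import io
from collections import defaultdict

csv_text = """image_id, xmin, ymin, xmax, ymax, class
daylily_0123.jpg, 123, 45, 200, 180, Rust
daylily_0123.jpg, 340, 90, 380, 140, Others
"""

# skipinitialspace tolerates the space after each comma.
by_image = defaultdict(list)
for row in csv.DictReader(io.StringIO(csv_text), skipinitialspace=True):
    box = tuple(int(row[k]) for k in ("xmin", "ymin", "xmax", "ymax"))
    by_image[row["image_id"]].append((box, row["class"]))
```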
Example: loading and visualizing bounding boxes in Python using PIL and xml.etree.ElementTree:
```python
import xml.etree.ElementTree as ET
from PIL import Image, ImageDraw

def load_voc_annotations(xml_path):
    tree = ET.parse(xml_path)
    root = tree.getroot()
    boxes, labels = [], []
    for obj in root.findall("object"):
        cls = obj.find("name").text
        b = obj.find("bndbox")
        xmin, ymin = int(b.find("xmin").text), int(b.find("ymin").text)
        xmax, ymax = int(b.find("xmax").text), int(b.find("ymax").text)
        boxes.append((xmin, ymin, xmax, ymax))
        labels.append(cls)
    return boxes, labels

img = Image.open("dataset_root/ideal/train/images/daylily_0123.jpg")
boxes, labels = load_voc_annotations("…/annotations/daylily_0123.xml")
draw = ImageDraw.Draw(img)
for (xmin, ymin, xmax, ymax), cls in zip(boxes, labels):
    draw.rectangle([xmin, ymin, xmax, ymax], outline="red", width=2)
    draw.text((xmin, ymin - 10), cls, fill="yellow")
img.show()
```
Researchers can compute mean and variance of lesions per image as follows:
```python
import glob
import xml.etree.ElementTree as ET
import numpy as np

counts = []
# Match the directory layout in Section 3: <subset>/<split>/annotations/*.xml
for xml_file in glob.glob("dataset_root/*/*/annotations/*.xml"):
    root = ET.parse(xml_file).getroot()
    counts.append(len(root.findall("object")))
mu, var = np.mean(counts), np.var(counts)
print(f"Mean per image: {mu:.2f}, variance: {var:.2f}")
```
7. Benchmarking Relevance and Implications
Daylily-Leaf is deployable as a benchmark for evaluating and training fine-grained, lesion-level object detectors in plant disease contexts characterized by background clutter and real-world image variation. It addresses typical confounders in agricultural vision, facilitating direct comparison of methods such as YOLO, RT-DETR, and next-generation hybrid architectures (e.g., TCLeaf-Net), especially for scenarios requiring reconciliation of global-to-local context and computational efficiency. Experimental evidence demonstrates that TCLeaf-Net, benchmarked on the Daylily-Leaf in-field split, attains mAP@50 of 78.2%, exceeding baseline models by 5.4 percentage points while reducing computation by 7.5 GFLOPs and GPU memory by 8.7% (Song et al., 13 Dec 2025).
A plausible implication is that this dataset, due to its dual condition design and annotation precision, promotes methodological generalizability and robustness in lesion-level plant disease detection and can be immediately deployed for transfer learning validation on related datasets such as PlantDoc, Tomato-Leaf, and Rice-Leaf.
In summary, Daylily-Leaf provides a compact, scientifically curated resource for advancing foliar disease detection research in both laboratory and field environments, with an emphasis on lesion-level discrimination, annotation rigor, and practical deployment protocols.