
StomataSeg: High-Res Stomatal Segmentation Data

Updated 15 March 2026
  • StomataSeg is a large-scale dataset providing high-resolution microscopy images of sorghum leaves with detailed annotations on stomatal components.
  • The dataset combines rigorous human annotation with semi-supervised Mask R-CNN pseudo-labeling, yielding 67,488 unique patch images for scalable phenotyping.
  • It supports both instance and semantic segmentation, with benchmarks showing that semi-supervised training improves mIoU and AP, enabling robust automated stomatal trait extraction.

StomataSeg is a curated, large-scale dataset for the instance and semantic segmentation of stomatal components in sorghum leaf microscopy images. Developed to address the bottlenecks in automated stomatal phenotyping—primarily annotation of minute, hierarchically nested plant structures—StomataSeg combines rigorous human annotation with semi-supervised pseudo-labelling to enable scalable extraction of stomatal traits relevant for crop science and physiology (Huang et al., 31 Jan 2026).

1. Composition and Image Acquisition

The dataset comprises 318 high-resolution microscopy images of intact sorghum leaves, captured using a Dino-Lite Edge/5 MP AM7515 digital microscope in a solar-weave greenhouse under ambient light conditions. Each image is acquired with leaves physiologically active, producing consistent imaging contexts and facilitating trait comparisons across genotypes and surfaces. Images have a native resolution of 2592×1944 pixels and are stored in JPEG format. Metadata are systematically embedded in the filenames, encoding genotype (five sorghum lines: QL12, TX7000, R931945-2-2, SC170-6-8, SC237-14E), leaf level (L9–L18, FL), leaf surface (adaxial/abaxial), and blade region (base, mid, tip).
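Since the metadata fields live in the filenames, they can be recovered with a small parser. The exact naming scheme is not documented here, so the underscore-delimited pattern below is a hypothetical sketch; only the field values (genotypes, leaf levels, surfaces, regions) come from the dataset description.

```python
import re

# Hypothetical filename layout, e.g. "QL12_L12_adaxial_mid.jpg".
# The real StomataSeg naming convention may order or delimit fields differently.
PATTERN = re.compile(
    r"(?P<genotype>QL12|TX7000|R931945-2-2|SC170-6-8|SC237-14E)_"
    r"(?P<leaf_level>L\d+|FL)_"
    r"(?P<surface>adaxial|abaxial)_"
    r"(?P<region>base|mid|tip)\.jpe?g$"
)

def parse_metadata(filename: str) -> dict:
    """Extract genotype, leaf level, surface, and blade region from a filename."""
    m = PATTERN.search(filename)
    if m is None:
        raise ValueError(f"Unrecognized filename: {filename}")
    return m.groupdict()
```

Grouping patches by these fields is what enables the per-genotype and per-surface trait comparisons described above.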

From these 318 raw images, a patch extraction protocol yields 11,060 human-annotated patches (341×341 pixels each, with a 10-pixel overlap). A further 56,428 pseudo-labelled patches are generated using a seed Mask R-CNN instance segmentation model for semi-supervised training, culminating in a total of 67,488 unique patch images.
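The tiling geometry follows from the stated numbers: a 341-pixel patch with a 331-pixel stride gives the 10-pixel overlap, plus an extra border tile per axis so the image edge is fully covered. The helper below is a plausible sketch of that grid, not the authors' published code.

```python
def patch_origins(length: int, patch: int = 341, stride: int = 331) -> list[int]:
    """Top-left coordinates along one axis, with a final tile flush to the border."""
    origins = list(range(0, length - patch + 1, stride))
    if origins[-1] + patch < length:   # extra tile so the border is fully sampled
        origins.append(length - patch)
    return origins

# For the native 2592x1944 resolution this yields an 8x6 grid of tiles.
xs = patch_origins(2592)   # horizontal origins
ys = patch_origins(1944)   # vertical origins
grid = [(x, y) for y in ys for x in xs]
```

At 48 tiles per image, 318 images give roughly 15,000 raw patches, consistent with 11,060 human-annotated patches surviving the ~25% filtering described later.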

2. Classes, Hierarchy, and Annotation Protocol

Three biologically distinct, spatially nested object classes are annotated:

  • Complex area: the whole stomatal complex, comprising both guard cells and pore.
  • Guard cell area: only the paired guard cells, without the pore.
  • Pore area: the stomatal opening, present only in open complexes (absent in closed stomata).

For semantic segmentation, a fourth background class is implicitly defined. The inclusion hierarchy is pore ⊂ guard cell ⊂ complex area. Annotation is conducted by three trained biologists, guided by pre-annotation consensus and documented protocols. Polygonal masks are drawn with pixel precision, prohibiting overlap between different object instances or classes. Ambiguous or partially visible structures are flagged for review, with only well-defined stomatal components retained in the final annotation set. Quality control is performed by an expert reviewer with an overall review pass rate of 90%.
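The inclusion hierarchy can be sanity-checked on rasterized binary masks. The function below is an illustrative check (with an assumed tolerance for boundary pixels), not part of the dataset's own QA tooling.

```python
import numpy as np

def is_contained(inner: np.ndarray, outer: np.ndarray, tol: float = 0.99) -> bool:
    """True if at least `tol` of the inner mask's pixels lie inside the outer mask.

    Both arguments are boolean arrays of the same shape. A tolerance below 1.0
    forgives a few boundary pixels from polygon rasterization.
    """
    inner_px = int(inner.sum())
    return inner_px > 0 and (inner & outer).sum() / inner_px >= tol
```

Applied per stoma, one would expect `is_contained(pore, complex_area)` and `is_contained(guard_cells, complex_area)` to hold for every open complex.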

| Class | Mask Count (approx., human-annotated) | Median Mask Coverage per Patch |
|---|---|---|
| Complex area | ~33,000 | ~2–3% |
| Guard cell area | ~33,000 | ~2% |
| Pore area | ~9,700 | <1% |

3. Patch Extraction, Pre-processing, and Filtering

Patches (341×341 pixels) are extracted from the full-resolution raw images using a stride of 331 pixels, yielding 10-pixel overlaps and complete tiling coverage. Extra tiles ensure image borders are fully sampled. Stringent filtering is applied: a stomatal instance is included in a patch only if more than 50% of its pixel area falls within that patch, and patches lacking any human-visible annotation (usually resulting from blur or out-of-focus regions) are discarded. Approximately 25% of raw patches are removed by this process, leaving an average of three stomatal complexes per retained human-annotated patch.
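The 50%-area inclusion rule can be sketched as follows. This is an assumed reimplementation operating on full-image binary masks; the published pipeline may apply the same rule to polygon geometry instead.

```python
import numpy as np

def assign_instances(patch_box, instances, min_fraction=0.5):
    """Keep an instance for a patch only if > min_fraction of its area falls inside.

    patch_box: (x0, y0, x1, y1) in full-image pixel coordinates.
    instances: list of full-image boolean masks, one per stomatal instance.
    Returns the cropped masks of the kept instances; an empty list means the
    patch carries no annotation and would be discarded.
    """
    x0, y0, x1, y1 = patch_box
    kept = []
    for mask in instances:
        area = mask.sum()
        inside = mask[y0:y1, x0:x1].sum()
        if area > 0 and inside / area > min_fraction:
            kept.append(mask[y0:y1, x0:x1])
    return kept
```

This rule prevents the same stoma from being counted as a full instance in two neighbouring, overlapping patches.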

No additional image augmentation is included natively within the dataset, though later model training code may apply standard transformations such as flips or scaling.

4. Dataset Structure, Organization, and Splits

StomataSeg is published on Zenodo under a CC BY-NC-ND 4.0 license (non-commercial, no derivatives), accessible at DOI 10.5281/zenodo.18216859. The dataset is organized into three principal directories: original_images/ (raw images + COCO JSON), original_patches/ (human-annotated patches), and pseudo_labels_patches/ (Mask R-CNN–based pseudo-labelled patches). Each split—training, validation, and test—is accompanied by a COCO-style JSON file encoding segmentation masks as polygons and corresponding patch images.

The dataset is split as follows:

| Split | Human-Annotated Patches | Pseudo-Labelled Patches | Source Images |
|---|---|---|---|
| Training | 7,662 | 56,428 | 222 |
| Validation | 2,238 | 0 | 63 |
| Test | 1,160 | 0 | 33 |

For semi-supervised experiments, the training set consists of both human-annotated and pseudo-labelled patches (total 64,090).

5. Annotation Workflow and Quality Assurance

Annotations are performed via the V7 cloud-based platform, utilizing polygon-masking tools, collaborative review, and progress tracking. Annotators undergo pre-annotation consensus training. For each structure, masks are non-overlapping at both the instance and class level. Annotation is restricted to clearly visible stomatal structures, with ambiguous features escalated for expert resolution. The annotation effort encompasses approximately 40,750 object masks, with a typical annotation time of 15 seconds per mask and a median of 25 minutes, 25 seconds per image. Iterative expert review ensures consistency and adherence to annotation standards, achieving a final pass rate of 90% for all images.

6. Biological and Statistical Coverage

The core 318-image set samples extensively across five sorghum genotypes, both leaf surfaces (adaxial/abaxial), three blade regions (base, mid, tip), and leaf levels L9–L18/FL, yielding biological diversity suitable for robust phenotyping studies. Mask coverage for the main classes is sparse, with complex and guard cell areas occupying 2–3% and 2% (respectively) of patch area by median, and pores covering less than 1%.

Dataset statistics (aggregated over all patches):

  • Total human-annotated instance masks: 40,750.
  • Complex and guard cell areas each have ~33,000 annotated instances, while the open-pore area is annotated in ~9,700 instances (reflecting that only ~25% of stomatal complexes possess a visible, open pore in these images).
  • Each human-annotated patch contains, on average, three stomatal complexes.

7. Evaluation Metrics and Benchmarking

Performance evaluation utilizes established metrics for both semantic and instance segmentation:

For semantic segmentation (over $C$ classes):

$$\mathrm{mIoU} = \frac{1}{C}\sum_{c=1}^{C}\frac{TP_c}{TP_c + FP_c + FN_c}, \qquad \mathrm{mAcc} = \frac{1}{C}\sum_{c=1}^{C}\frac{TP_c}{TP_c + FN_c}$$

where $TP_c$, $FP_c$, and $FN_c$ are the true-positive, false-positive, and false-negative pixel counts for class $c$.
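Both semantic metrics follow directly from a per-class confusion matrix; a minimal sketch:

```python
import numpy as np

def miou_macc(conf: np.ndarray):
    """Compute mIoU and mAcc from a CxC pixel confusion matrix.

    Rows are ground-truth classes, columns are predicted classes.
    """
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp   # pixels predicted as class c but belonging elsewhere
    fn = conf.sum(axis=1) - tp   # ground-truth pixels of class c that were missed
    miou = float(np.mean(tp / (tp + fp + fn)))
    macc = float(np.mean(tp / (tp + fn)))
    return miou, macc
```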

For instance segmentation (over $N$ classes):

$$\mathrm{mAP} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{AP}_i, \qquad \mathrm{AP}_i = \int_{0}^{1} p_i(r)\,dr$$

where $p_i(r)$ is the precision–recall curve for class $i$; $\mathrm{AP}_{50}$ denotes average precision at an IoU threshold of 0.5.
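The AP integral is computed in practice as the area under an interpolated precision–recall curve. The sketch below uses the common monotone-envelope (all-point) interpolation; COCO-style evaluation instead samples 101 fixed recall points, so exact numbers may differ slightly.

```python
import numpy as np

def average_precision(recall, precision):
    """Area under a precision-recall curve with monotone-envelope interpolation.

    recall and precision are parallel sequences sorted by increasing recall.
    """
    r = np.concatenate(([0.0], np.asarray(recall, float), [1.0]))
    p = np.concatenate(([0.0], np.asarray(precision, float), [0.0]))
    # Make precision non-increasing from right to left (the interpolation envelope).
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas where recall actually changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```

Averaging this per-class AP over the three stomatal classes gives mAP as defined above.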

Benchmarking shows that augmenting training with pseudo-labelled patches increases the top semantic-segmentation mIoU from 65.93% to 70.35%, and the top instance-segmentation AP from 28.30% to 46.10%. This demonstrates that patch-based preprocessing in conjunction with semi-supervised learning significantly enhances fine-grained stomatal structure segmentation (Huang et al., 31 Jan 2026).

8. Access, Usage, and Integration

StomataSeg is intended for non-commercial research in stomatal phenotyping. Data and annotations can be fetched from Zenodo and natively loaded with standard COCO dataset wrappers. There are no separate mask-image files; all annotation is stored in polygon form within the COCO JSONs. The dataset directly supports integration into Mask R-CNN or semantic segmentation pipelines for trait extraction. For reproduction of training, evaluation, and pseudo-labelling, code is provided at https://github.com/Davidhzt/StomataSeg_full (Huang et al., 31 Jan 2026). Use for derivative datasets or commercial applications requires separate permission from the dataset providers.
