Uncertainty-Guided Attention and Entropy-Weighted Loss for Precise Plant Seedling Segmentation

Published 12 Apr 2026 in cs.CV and cs.LG | (2604.10823v1)

Abstract: Plant seedling segmentation supports automated phenotyping in precision agriculture. Standard segmentation models struggle with cluttered backgrounds and fine leaf structures. We introduce UGDA-Net (Uncertainty-Guided Dual Attention Network with Entropy-Weighted Loss and Deep Supervision), which has three components. The first, Uncertainty-Guided Dual Attention (UGDA), uses channel variance to modulate feature maps. The second is an entropy-weighted hybrid loss function that focuses on high-uncertainty boundary pixels. The third is deep supervision of intermediate encoder layers. We performed a systematic ablation study on two widely used architectures, U-Net and LinkNet, analyzing five incremental configurations: Baseline, Loss-only, Attention-only, Deep Supervision, and UGDA-Net. We trained UGDA-Net on a high-resolution plant seedling image dataset containing 432 images. UGDA-Net improves the Dice coefficient by 9.3% over the baseline for U-Net and by 13.2% for LinkNet. Qualitative overlays show fewer false positives at leaf boundaries, and uncertainty heatmaps are consistent with the complex leaf morphology. UGDA-Net aids the segmentation of delicate plant structures at high resolution. The results show that uncertainty-guided attention and uncertainty-weighted loss are complementary.

Authors (2)

Summary

The paper introduces UGDA-Net, which fuses uncertainty-guided dual attention with an entropy-weighted loss to improve boundary precision in plant seedling segmentation.
It combines an encoder-decoder architecture with deep supervision, achieving a 9.3%-13.2% Dice coefficient improvement compared to standard baselines.
Qualitative results demonstrate reduced false positives and negatives, enabling robust segmentation of fine plant structures in complex agricultural scenes.

Introduction and Motivation

Segmentation of plant seedlings is an enabling technology for high-throughput phenotyping in precision agriculture, but it must cope with complex environments such as cluttered backgrounds and intricate plant morphology. Conventional CNN-based architectures, including U-Net and LinkNet, are widely deployed but tend to falter on boundaries and fine structures, often producing substantial false positives due to soil or container interference. Recent work has explored attention mechanisms and uncertainty quantification, yet largely as separate modules. The paper proposes UGDA-Net (Uncertainty-Guided Dual Attention Network with Entropy-Weighted Loss and Deep Supervision) (2604.10823), an architecture that combines uncertainty estimation, feature attention, and loss adaptation to address these shortcomings.

Figure 1: Sample from the plant seedling segmentation dataset illustrating the complex background and fine plant structures encountered.

Methodology

Plant Seedling Dataset and Preprocessing

The experiments use a public dataset of 432 high-resolution RGB images, each annotated at the pixel level as plant or background. The data are split 70%/15%/15% into training, validation, and test sets. Images are resized to $256\times256$ px and augmented with horizontal flips, brightness/contrast modulation, and Gaussian blur to improve generalization. Channel normalization follows ImageNet statistics.
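The split and normalization described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' pipeline: the function names, the random seed, and the rounding used to turn 70%/15%/15% of 432 into whole image counts are all assumptions. The mean and standard deviation values are the commonly used ImageNet RGB statistics for inputs scaled to [0, 1].

```python
import numpy as np

# Commonly used ImageNet channel statistics (RGB, inputs scaled to [0, 1]).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def split_indices(n_images, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle image indices and cut them into train/val/test subsets.

    ASSUMPTION: subset sizes are obtained by rounding; the source only
    states the 70%/15%/15% proportions.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_images)
    n_train = int(round(train_frac * n_images))
    n_val = int(round(val_frac * n_images))
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])


def normalize(image_uint8):
    """Convert an (H, W, 3) uint8 RGB image to ImageNet-normalized float32."""
    scaled = image_uint8.astype(np.float32) / 255.0
    return (scaled - IMAGENET_MEAN) / IMAGENET_STD


def horizontal_flip(image, mask):
    """Flip image and mask together so pixel labels stay aligned."""
    return image[:, ::-1].copy(), mask[:, ::-1].copy()


train_idx, val_idx, test_idx = split_indices(432)
print(len(train_idx), len(val_idx), len(test_idx))  # 302 65 65
```

Geometric augmentations such as the flip must be applied to the image and its mask jointly, whereas photometric ones (brightness, contrast, blur) apply to the image only.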

UGDA-Net Architecture Description

UGDA-Net is designed atop the encoder-decoder paradigm, augmenting popular backbone networks (U-Net, LinkNet) with three key innovations:

Uncertainty-Guided Dual Attention (UGDA): At each encoder stage (other than the shallowest), UGDA computes channel and spatial attention maps, then fuses a pixel-level uncertainty signal derived from the channel-wise standard deviation. This uncertainty is incorporated multiplicatively into the attention modulation, focusing network capacity on ambiguous regions at the boundaries.
Entropy-Weighted Hybrid Loss: The pixel-wise entropy (from model probabilities) is calculated and used to scale the binary cross-entropy component, combined linearly with Dice loss ($\lambda_1=0.7$, $\lambda_2=0.3$). A warm-up phase starts training without entropy weighting for the initial epochs ($\beta=0.3$ empirically selected).
Deep Supervision: Auxiliary segmentation heads are placed at two deep encoder layers, weighted ($w_1=0.3$, $w_2=0.7$) within the overall loss, stabilizing gradient flow and enforcing multi-scale feature consistency.
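One plausible reading of the entropy-weighted hybrid loss and the deep-supervision weighting is sketched below in NumPy. Several details are assumptions rather than statements from the paper: the per-pixel weight is taken to be $1 + \beta H$, where $H$ is the binary entropy; $\lambda_1$ is assigned to the cross-entropy term and $\lambda_2$ to the Dice term; and the auxiliary losses are simply added to the main loss. All function names are hypothetical.

```python
import numpy as np

EPS = 1e-7  # numerical guard for log() and division


def pixel_entropy(p):
    """Binary entropy (in nats) of foreground probabilities p."""
    p = np.clip(p, EPS, 1.0 - EPS)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def dice_loss(p, y):
    """Soft Dice loss: 1 - 2 * sum(p * y) / (sum(p) + sum(y))."""
    inter = np.sum(p * y)
    return 1.0 - (2.0 * inter + EPS) / (np.sum(p) + np.sum(y) + EPS)


def hybrid_loss(p, y, beta=0.3, lam_bce=0.7, lam_dice=0.3, use_entropy=True):
    """Entropy-weighted BCE combined linearly with Dice loss.

    ASSUMPTIONS: the weight form 1 + beta * H and the mapping of
    lambda_1 -> BCE, lambda_2 -> Dice are illustrative guesses.
    use_entropy=False mimics the warm-up epochs without entropy weighting.
    """
    p = np.clip(p, EPS, 1.0 - EPS)
    bce = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    weight = 1.0 + beta * pixel_entropy(p) if use_entropy else np.ones_like(p)
    return lam_bce * np.mean(weight * bce) + lam_dice * dice_loss(p, y)


def deeply_supervised_loss(main, aux1, aux2, w1=0.3, w2=0.7):
    """Add the two auxiliary-head losses with weights w1 and w2 (assumed additive)."""
    return main + w1 * aux1 + w2 * aux2


# Toy 2x2 prediction: two confident pixels and two near the decision boundary.
probs = np.array([[0.9, 0.55], [0.1, 0.45]])
labels = np.array([[1.0, 1.0], [0.0, 0.0]])
print("warm-up loss:", hybrid_loss(probs, labels, use_entropy=False))
print("entropy-weighted loss:", hybrid_loss(probs, labels))
```

Because entropy peaks at a probability of 0.5, the weighting raises the contribution of uncertain pixels (typically boundaries) while leaving confident pixels almost unchanged. In an autograd framework the weight would usually be treated as a constant so that gradients do not flow through the entropy term; that too is a design assumption.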

Quantitative Analysis and Ablation Study

A systematic ablation study compares five configurations (Baseline, Loss-only, Attention-only, Deep Supervision-only, and full UGDA-Net) for both U-Net and LinkNet backbones using Dice coefficient and IoU metrics.

UGDA-Net achieves a 9.3% Dice improvement on U-Net and 13.2% Dice increase on LinkNet over baselines.
Entropy-weighted loss has the largest standalone impact: Loss-only yields a 7.6% Dice increase for U-Net and a 12% gain for LinkNet.
The attention module alone yields only marginal improvement (U-Net: +0.9%; LinkNet: +0.89%), but when combined with the other components it supports uncertainty-driven boundary refinement.
Deep supervision yields mixed results: negligible or negative impact for U-Net (likely due to inherent skip connections), but substantial improvement for LinkNet (+5.6% Dice).

These results indicate that uncertainty-guided supervision is effective for resolving boundary ambiguity and segmenting fine plant structures, especially in architectures with weaker intrinsic multi-scale connections.
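For reference, the two metrics used in the ablation can be computed from binary masks as follows. This is a generic sketch of the standard definitions; the function name and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the paper.

```python
import numpy as np


def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # ASSUMPTION: two empty masks count as a perfect match.
    dice = 2.0 * inter / total if total > 0 else 1.0
    iou = inter / union if union > 0 else 1.0
    return float(dice), float(iou)


pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.3f}, IoU={iou:.3f}")  # Dice=0.667, IoU=0.500
```

The two metrics are monotonically related by Dice = 2·IoU / (1 + IoU), so they rank individual predictions identically even though averaged scores can differ.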

Qualitative Evaluation

UGDA-Net consistently produces segmentation masks that delineate leaf boundaries with higher fidelity, effectively suppresses false positives near soil, and successfully captures fine features such as leaf tips and narrow stems. Uncertainty heatmaps tightly correlate with boundary regions, indicating that model uncertainty is indeed localized at the zones of semantic ambiguity.

Figure 2: Input image used for qualitative evaluation, demonstrating the complexity encountered in practical segmentation tasks.

Further overlays show fewer false negatives and false positives in UGDA-Net's predictions than in those of the Baseline and Loss-only configurations. Color-coded overlays (green: true positives, red: false positives, blue: false negatives) visually corroborate the quantitative findings by showing the gains in boundary precision.
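A color-coded error overlay of this kind can be produced with a few lines of NumPy. The sketch below is illustrative only: the function name, the blending factor `alpha`, and the exact RGB values are assumptions, with only the green/red/blue assignment taken from the description above.

```python
import numpy as np

GREEN = (0, 255, 0)   # true positives
RED = (255, 0, 0)     # false positives
BLUE = (0, 0, 255)    # false negatives


def error_overlay(image, pred, target, alpha=0.5):
    """Blend TP/FP/FN colors onto an (H, W, 3) uint8 RGB image."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    out = image.astype(np.float32)
    regions = (
        (pred & target, GREEN),
        (pred & ~target, RED),
        (~pred & target, BLUE),
    )
    for mask, color in regions:
        # Alpha-blend the class color over the pixels selected by the mask.
        out[mask] = (1.0 - alpha) * out[mask] + alpha * np.array(color, dtype=np.float32)
    return out.round().astype(np.uint8)


image = np.zeros((2, 2, 3), dtype=np.uint8)
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [1, 0]])
overlay = error_overlay(image, pred, target, alpha=1.0)
print(overlay[0, 0], overlay[0, 1], overlay[1, 0])
```

True negatives are left untouched, so the background of the original image remains visible around the colored regions.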

Practical and Theoretical Implications and Future Directions

The deterministic uncertainty calculation (channel-wise standard deviation and entropy) used by UGDA-Net is computationally cheaper than Bayesian or test-time-augmentation schemes, making it amenable to real-time phenotyping deployments. By integrating uncertainty signals directly into both feature refinement and loss weighting, the model concentrates representational capacity on ambiguous regions and weights their errors more heavily, which supports robust learning of delicate plant morphology.
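To make the single-pass nature of this uncertainty signal concrete, the sketch below derives a per-pixel uncertainty map from the channel-wise standard deviation of one feature tensor and folds it multiplicatively into a simplified dual attention. It is a rough illustration under stated assumptions: the real module presumably uses learned layers for channel and spatial attention, whereas here both are parameter-free, and the min-max scaling and the `1 + u` fusion are guesses. The names `uncertainty_map` and `ugda_sketch` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def uncertainty_map(features):
    """Per-pixel std across channels of a (C, H, W) tensor, scaled to [0, 1]."""
    std = features.std(axis=0)
    span = std.max() - std.min()
    # ASSUMPTION: min-max scaling; a constant map yields zero uncertainty.
    return (std - std.min()) / span if span > 0 else np.zeros_like(std)


def ugda_sketch(features):
    """Parameter-free stand-in for uncertainty-guided dual attention.

    ASSUMPTIONS: attention maps come from simple pooling plus a sigmoid
    (no learned weights), and uncertainty enters as a (1 + u) multiplier.
    """
    channel_att = sigmoid(features.mean(axis=(1, 2)))[:, None, None]  # (C, 1, 1)
    spatial_att = sigmoid(features.mean(axis=0))[None]                # (1, H, W)
    u = uncertainty_map(features)                                     # (H, W)
    refined = features * channel_att * spatial_att * (1.0 + u[None])
    return refined, u


rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4, 4))   # toy feature map: 8 channels, 4x4 pixels
refined, u = ugda_sketch(feats)
print(refined.shape, float(u.min()), float(u.max()))
```

Everything here is computed from one forward pass over one tensor, in contrast to Monte Carlo dropout or test-time augmentation, which need repeated forward passes to estimate uncertainty.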

The approach also has broader implications: the simultaneous use of uncertainty in attention and loss could generalize to any semantic segmentation domain where boundary precision and class imbalance are paramount. Potential future directions include integration with transformer-based encoders, exploration of alternate uncertainty proxies (e.g., Mahalanobis distance in feature space), evaluation across multi-species datasets, and optimization for edge deployment (real-time inference on embedded devices).

Conclusion

UGDA-Net offers a unified framework for uncertainty-aware plant seedling segmentation that combines uncertainty-guided attention, entropy-weighted loss, and deep supervision in a single architecture. Substantial improvements in boundary accuracy, supported by both quantitative metrics and qualitative evaluation, underscore the effectiveness of combining uncertainty with feature attention and adaptive loss in semantic segmentation. The framework can directly benefit automated phenotyping pipelines and suggests research directions in uncertainty propagation and robust segmentation in agriculture and beyond.
