Effect of Input Resolution on Retinal Vessel Segmentation Performance: An Empirical Study Across Five Datasets

Published 3 Apr 2026 in cs.CV | (2604.02977v1)

Abstract: Most deep learning pipelines for retinal vessel segmentation resize fundus images to satisfy GPU memory constraints and enable uniform batch processing. However, the impact of this resizing on thin vessel detection remains underexplored. When high resolution images are downsampled, thin vessels are reduced to subpixel structures, causing irreversible information loss even before the data enters the network. Standard volumetric metrics such as the Dice score do not capture this loss because thick vessel pixels dominate the evaluation. We investigated this effect by training a baseline UNet at multiple downsampling ratios across five fundus datasets (DRIVE, STARE, CHASE_DB1, HRF, and FIVES) with native widths ranging from 565 to 3504 pixels, keeping all other settings fixed. We introduce a width-stratified sensitivity metric that evaluates thin (half-width <3 pixels), medium (3 to 7 pixels), and thick (>7 pixels) vessel detection separately, using native resolution width estimates derived from a Euclidean distance transform. Results show that for high-resolution datasets (HRF, FIVES), thin vessel sensitivity improves monotonically as images are downsampled toward the encoder's effective operating range, peaking at processed widths between 256 and 876 pixels. For low-to-mid resolution datasets (DRIVE, STARE, CHASE_DB1), thin vessel sensitivity is highest at or near native resolution and degrades with any downsampling. Across all five datasets, aggressive downsampling reduced thin vessel sensitivity by up to 15.8 percentage points (DRIVE) while Dice remained relatively stable, confirming that Dice alone is insufficient for evaluating microvascular segmentation.


Authors (1)

Amarnath R

Summary

The paper shows that input resolution directly impacts thin vessel sensitivity, with native or near-native resolution optimizing performance for low–mid-resolution datasets and moderate downsampling benefiting high-resolution images.
It demonstrates that conventional Dice coefficients obscure critical microvascular losses, emphasizing the need for width-stratified sensitivity to accurately evaluate thin vessel detection.
The findings suggest adapting preprocessing and network architectures to align input scale with the encoder’s receptive field, thereby improving clinical screening and diagnostic outcomes.

Effect of Input Resolution on Retinal Vessel Segmentation: An Empirical Analysis

Introduction

Segmentation of the retinal vasculature in fundus images is essential for automated analysis of diabetic and hypertensive retinopathy. Modern pipelines predominantly use encoder-decoder convolutional networks such as UNet, and routinely resize whole images during preprocessing to fit GPU memory and enable batch processing. However, this resizing causes a variable loss of microvascular information: thin vessels, particularly those with a half-width below 3 pixels, are susceptible to subpixel erasure under aggressive downsampling. Conventional volumetric metrics, especially the Dice coefficient, do not adequately reflect missed thin vessels, which leads to misleading performance assessments and confounded cross-dataset benchmark comparisons.
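The subpixel erasure described above can be illustrated with a toy mask. This is a minimal sketch, assuming area-averaged downsampling followed by re-binarization at 0.5; the paper's actual interpolation method is not stated in this summary, and the helper names `downsample_mask` and `vertical_vessel` are hypothetical.

```python
def downsample_mask(mask, factor, threshold=0.5):
    """Block-average a binary mask by `factor`, then re-binarize at `threshold`.

    Assumption: area averaging plus thresholding stands in for whatever
    resizing the original pipeline used.
    """
    height, width = len(mask), len(mask[0])
    out = []
    for r in range(0, height - height % factor, factor):
        row = []
        for c in range(0, width - width % factor, factor):
            block_sum = sum(
                mask[r + i][c + j] for i in range(factor) for j in range(factor)
            )
            coverage = block_sum / (factor * factor)
            row.append(1 if coverage >= threshold else 0)
        out.append(row)
    return out


def vertical_vessel(size, left, width):
    """Synthetic square mask with one vertical vessel `width` pixels wide."""
    return [
        [1 if left <= c < left + width else 0 for c in range(size)]
        for _ in range(size)
    ]


thin = vertical_vessel(16, left=5, width=1)   # 1-pixel-wide vessel
thick = vertical_vessel(16, left=4, width=4)  # 4-pixel-wide vessel

thin_small = downsample_mask(thin, 4)
thick_small = downsample_mask(thick, 4)

print(sum(map(sum, thin_small)))   # 0: the thin vessel covers only 25% of each block
print(sum(map(sum, thick_small)))  # 4: the thick vessel survives as one column
```

In this toy case the thin vessel is lost before any network sees the image, which is the effect the study sets out to measure.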

Experimental Design

A controlled ablation study was performed on five widely used retinal fundus datasets (DRIVE, STARE, CHASE_DB1, HRF, and FIVES) with native widths from 565 px to 3504 px, covering the typical range of clinical fundus acquisition protocols. A fixed 4-stage UNet (1.9M parameters, standard double convolution blocks and max pooling) was trained under five systematically varied resizing conditions per dataset. All other training configurations and augmentations were held constant across experiments.

To address the blind spots of volumetric metrics, a width-stratified sensitivity evaluation was implemented. Using a Euclidean distance transform, vessel pixels of the native ground truth mask were stratified into thin (half-width < 3 px), medium (3–7 px), and thick (> 7 px) categories, and sensitivity was reported separately for each class.
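A width-stratified sensitivity of this kind could be computed roughly as follows. This is a sketch under stated assumptions, not the paper's implementation. It uses a brute-force Euclidean distance transform in pure Python (a library routine such as `scipy.ndimage.distance_transform_edt` would normally replace it). It takes each vessel pixel's distance to the nearest background pixel as its half-width estimate, so the rim of a wide vessel is labeled thin. It also assumes the prediction has already been resampled to the native grid. The function names are hypothetical.

```python
import math


def half_width_map(mask):
    """Distance from each vessel pixel to the nearest background pixel.

    Brute-force Euclidean distance transform, suitable only for small masks.
    """
    height, width = len(mask), len(mask[0])
    background = [
        (r, c) for r in range(height) for c in range(width) if not mask[r][c]
    ]
    dist = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if mask[r][c]:
                dist[r][c] = min(
                    (math.hypot(r - br, c - bc) for br, bc in background),
                    default=math.inf,
                )
    return dist


def width_class(half_width, thin_max=3.0, thick_min=7.0):
    """Thresholds from the text: thin < 3 px, medium 3 to 7 px, thick > 7 px."""
    if half_width < thin_max:
        return "thin"
    if half_width > thick_min:
        return "thick"
    return "medium"


def stratified_sensitivity(gt_mask, pred_mask):
    """Per-class sensitivity (TP / ground-truth positives), None if class is empty."""
    dist = half_width_map(gt_mask)
    hits = {"thin": 0, "medium": 0, "thick": 0}
    totals = {"thin": 0, "medium": 0, "thick": 0}
    for r, row in enumerate(gt_mask):
        for c, value in enumerate(row):
            if value:
                cls = width_class(dist[r][c])
                totals[cls] += 1
                hits[cls] += 1 if pred_mask[r][c] else 0
    return {k: (hits[k] / totals[k] if totals[k] else None) for k in totals}


# Toy ground truth: a 2-px vessel (columns 2-3) and a 6-px vessel (columns 10-15).
gt = [[1 if c in (2, 3) or 10 <= c <= 15 else 0 for c in range(20)] for _ in range(12)]
# Toy prediction that misses the 2-px vessel entirely.
pred = [[1 if 10 <= c <= 15 else 0 for c in range(20)] for _ in range(12)]

result = stratified_sensitivity(gt, pred)
print(result)
```

Here the medium class is fully detected while the thin class is not, a difference that a single pooled sensitivity figure would blur.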

Main Findings

Resolution-Dependent Thin Vessel Sensitivity

Analysis revealed a strong dataset-dependent effect of input resizing on thin vessel sensitivity. For low- to mid-resolution datasets such as DRIVE (565 px) and STARE (700 px), the highest thin vessel sensitivity was achieved at or near native resolution (DRIVE: 0.6723; STARE: 0.6665). Downsampling produced a marked and consistent drop, with losses of up to 15.8 percentage points in DRIVE and 12.0 in STARE under 4× downsampling, confirming that reducing these images below acquisition resolution is detrimental to microvascular segmentation.

Conversely, in high-resolution datasets (HRF at 3504 px, FIVES at 2048 px), moderate downsampling improved thin vessel sensitivity (HRF: 0.5847 → 0.6495 at 4×; FIVES: 0.5660 → 0.6595 at 8×), indicating that thin vessels are more robustly encoded as multi-pixel structures when image scale better matches the encoder's effective receptive field. Aggressive downsampling beyond the optimal window decreased Dice and specificity, reflecting a trade-off between thin vessel recall and increased false positives.

CHASE_DB1 (1280 px), a mid-resolution dataset, showed a flatter response curve, with peak thin vessel sensitivity at native resolution and only minor changes across downsampling ratios, underscoring the interaction between dataset acquisition parameters and network bottleneck resolution.
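The arithmetic behind the figures in this section can be checked with a short script. This is a sketch using only numbers quoted in the text; the helper names `processed_width` and `gain_pp` are hypothetical, and the processed width is assumed to be the native width divided by the downsampling ratio.

```python
# Native widths as quoted in this section.
NATIVE_WIDTH = {
    "DRIVE": 565,
    "STARE": 700,
    "CHASE_DB1": 1280,
    "HRF": 3504,
    "FIVES": 2048,
}


def processed_width(dataset, ratio):
    """Image width after downsampling by `ratio` (assumed: native / ratio)."""
    return NATIVE_WIDTH[dataset] / ratio


def gain_pp(before, after):
    """Change in a sensitivity value, expressed in percentage points."""
    return (after - before) * 100.0


hrf_width = processed_width("HRF", 4)      # 876.0
fives_width = processed_width("FIVES", 8)  # 256.0
hrf_gain = gain_pp(0.5847, 0.6495)         # HRF thin sensitivity, native -> 4x
fives_gain = gain_pp(0.5660, 0.6595)       # FIVES thin sensitivity, native -> 8x

print(hrf_width, fives_width)
print(round(hrf_gain, 2), round(fives_gain, 2))  # 6.48 9.35
```

The two processed widths match the 256 to 876 pixel range given in the abstract for peak thin vessel sensitivity on the high-resolution datasets.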

Limitations of Standard Volumetric Metrics

The Dice coefficient and global sensitivity remained largely stable across resolution conditions, particularly for high-resolution datasets, masking substantial changes in thin vessel detection. Notably, for HRF, thin vessel sensitivity improved with downsampling while Dice remained nearly unchanged, showing that Dice is insufficient as a sole performance indicator for microvascular segmentation. Spearman correlation analysis further confirmed this decoupling (ρ = −0.90 for HRF).
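To give a sense of what ρ = −0.90 means, the sketch below computes a Spearman rank correlation from scratch. It assumes the correlation is taken between Dice and thin vessel sensitivity over the five resizing conditions, which this summary does not state explicitly. The rank lists are hypothetical placeholders, not the paper's data.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman correlation via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical orderings of five resizing conditions (placeholders only).
dice_order = [1, 2, 3, 4, 5]
thin_sensitivity_order = [5, 4, 3, 1, 2]

rho = spearman_rho(dice_order, thin_sensitivity_order)
print(round(rho, 2))  # -0.9
```

With five conditions, a value of −0.90 corresponds to an almost fully reversed ranking: one adjacent swap away from a perfect inversion.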

Benchmark Comparability and Clinical Implications

Evaluations that report only volumetric Dice on datasets processed at different effective resolutions are not comparable. A model reported on DRIVE at native resolution cannot be fairly compared with one evaluated on HRF downsampled to 512 px, because thin vessel detection operates under different structural priors in the two settings. Width-stratified sensitivity provides the granularity needed for reproducible and interpretable benchmarking.
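A small worked example shows why Dice alone can hide thin vessel losses when thick vessel pixels dominate the mask. The pixel counts are hypothetical and chosen only for illustration; the formulas are the standard definitions of Dice and sensitivity.

```python
def dice(tp, fp, fn):
    """Dice coefficient from pixel counts: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


def sensitivity(tp, fn):
    """Sensitivity (recall): TP / (TP + FN)."""
    return tp / (tp + fn)


# Hypothetical ground truth: thick vessels account for 90% of vessel pixels.
thick_pixels, thin_pixels = 9000, 1000

# Hypothetical model: detects all thick pixels, half of the thin ones,
# and produces no false positives.
thin_found = 500
tp = thick_pixels + thin_found
fn = thin_pixels - thin_found

overall_dice = dice(tp, fp=0, fn=fn)
thin_sens = sensitivity(thin_found, thin_pixels - thin_found)

print(round(overall_dice, 4))  # 0.9744
print(thin_sens)               # 0.5
```

In this toy case half of the thin vessel pixels are missed, yet Dice falls by under three points from a perfect score, which is why two pipelines with similar Dice can differ sharply in microvascular recall.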

From a clinical perspective, these findings indicate that for population screening, where early microvascular pathology is the primary concern, the smallest input size within the encoder's optimal operating window should be selected, prioritizing thin vessel recall over precision. In diagnostic confirmation settings, maximizing overall segmentation accuracy and reducing false positives may require less aggressive downsampling, shifting the sensitivity-precision trade-off toward precision.

Implications for Architecture and Future Research

These results challenge the convention of fixed image resizing pipelines in retinal vessel segmentation and argue for architectures and benchmarking protocols that explicitly account for native dataset resolution. While multiscale and topology-preserving approaches (e.g., FPNs and clDice [20]) offer partial remediation, they cannot recover vessel structures already lost to preprocessing-induced aliasing.

The study was confined to a basic UNet backbone to isolate the effect of input scaling; generalization to architectures with dilated convolutions, transformers, or hybrid multiscale designs remains to be quantified. The comparison between optimally-resized whole-image pipelines and patch-based training at native resolution is also an open direction with potential to yield more resolution-robust segmentation.

Conclusion

The study demonstrates that the choice of input resolution has a dataset- and vessel-width-dependent influence on retinal vessel segmentation performance. Thin vessel sensitivity is maximized at native or near-native resolution for low- to mid-resolution datasets, and at moderate downsampling for high-resolution datasets, owing to the alignment of vascular scale with the encoder’s representation capacity. Volumetric metrics such as Dice are inadequate for revealing these effects, so width-stratified sensitivity is needed for meaningful inter-dataset and inter-model comparisons. These findings argue that both architectural selection and reported evaluation metrics should be explicitly calibrated to native dataset resolution and structural target in order to yield clinically meaningful and reproducible segmentation outcomes.
