GLCP: Global-to-Local Connectivity in Segmentation
- GLCP is a framework that preserves global topology and local continuity in segmenting tubular structures, crucial for accurate biomedical analysis.
- It employs a shared encoder–decoder with an Interactive Multi-head Segmentation module that jointly predicts global, skeleton, and discontinuity maps.
- The Dual-Attention-based Refinement module fuses global and local cues, yielding higher Dice scores, better topological integrity, and fewer segmentation errors.
Accurate segmentation of tubular structures—such as vascular, neural, or other biological networks—requires simultaneous preservation of global topology and local continuity to support robust clinical and scientific analysis. The Global-to-Local Connectivity Preservation (GLCP) framework (Zhou et al., 28 Jul 2025) was developed to address structural fragmentation (notably, the breaking of vessel branches or loss of connectivity) in deep learning–based segmentation of such networks. GLCP’s innovation is an architecture and training procedure that concurrently models global network structure and local discontinuities, using explicitly designed multi-head modules and refinement mechanisms.
1. Architectural Design and Global–Local Integration
GLCP’s core is a shared encoder–decoder backbone (e.g., nnUNet or transformer-based SwinUNETR) that extracts hierarchical features from volumetric or sliced image patches. This backbone branches into the Interactive Multi-head Segmentation (IMS) module, where three parallel segmentation heads are trained jointly:
- Global segmentation head: produces the main segmentation map, striving for global topological completeness.
- Skeleton prediction head: outputs a skeletonization map, representing the medial axis or “backbone” of the tubular network, which helps maintain structural connectivity across large spatial scales.
- Discontinuity prediction head: predicts local discontinuity maps, explicitly identifying regions (often endpoints or branch breaks) where the segmentation is likely to be fragmented.
The IMS outputs are then fused in the Dual-Attention-based Refinement (DAR) module, which leverages both global (skeleton-derived) and local (discontinuity-derived) attentional cues to refine the initial segmentation map. The overall framework is modular and designed to be agnostic to the choice of encoder–decoder backbone.
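The branching described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the backbone is replaced by a random feature tensor, each head is reduced to a single 1x1 convolution followed by a sigmoid, and the names `MultiHead` and `conv1x1` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1x1(features, weight, bias):
    """1x1 convolution: mixes channels independently at every pixel.

    features: (C_in, H, W), weight: (C_out, C_in), bias: (C_out,)
    """
    return np.einsum("oc,chw->ohw", weight, features) + bias[:, None, None]

class MultiHead:
    """Three parallel heads reading the same shared feature map (illustrative only)."""

    HEAD_NAMES = ("segmentation", "skeleton", "discontinuity")

    def __init__(self, in_channels, rng):
        # One (weight, bias) pair per head; every head outputs a single channel.
        self.params = {
            name: (rng.normal(0.0, 0.1, size=(1, in_channels)), np.zeros(1))
            for name in self.HEAD_NAMES
        }

    def __call__(self, features):
        # All heads consume the same features, so the backbone is shared.
        return {
            name: sigmoid(conv1x1(features, w, b))[0]
            for name, (w, b) in self.params.items()
        }

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16, 16))  # stand-in for the decoder output
maps = MultiHead(in_channels=8, rng=rng)(features)
for name, probability_map in maps.items():
    print(name, probability_map.shape)
```

Because every head reads the same feature tensor, swapping the backbone only changes how `features` is produced, which is the sense in which the design is backbone-agnostic.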
2. Interactive Multi-head Segmentation (IMS) Module
The IMS module is designed to jointly supervise and reinforce both global and local quality criteria:
- Skeleton and Discontinuity Label Construction: Skeletons are extracted from both ground truth (GT) and predicted segmentations using standard thinning algorithms. Endpoints are detected via convolutional filtering. For each predicted endpoint $p_i$, the minimal Euclidean distance to the GT endpoints $g_j$ is calculated: $d_i = \min_j \lVert p_i - g_j \rVert_2$. A dynamic threshold $\tau$ is then computed, and predicted endpoints with $d_i > \tau$ are marked as likely discontinuity points. Redundant or closely clustered points are collapsed using DBSCAN. For each remaining point, a spatial mask (cube) is created for discontinuity labeling.
- Self-supervised Consistency Loss: To enforce consistency between the global segmentation and skeleton predictions, a Kullback–Leibler (KL) divergence-based loss is imposed between the skeleton extracted from the segmentation and the skeleton head’s output. The loss uses a softmax normalization, the ground-truth skeleton, and a gradient truncation function that prevents interference during optimization. It ensures that the global segmentation and skeleton branches reinforce each other’s topological predictions.
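The discontinuity label construction in the first bullet can be sketched as follows. This is a simplified illustration under stated assumptions: endpoints are taken as already detected, the threshold is passed in as a plain parameter `tau` because the formula for the dynamic threshold is not reproduced here, the DBSCAN de-duplication step is omitted, and the function names and the `half_width` cube parameter are hypothetical.

```python
import numpy as np

def min_endpoint_distances(pred_pts, gt_pts):
    """Distance from each predicted endpoint to its nearest GT endpoint."""
    diff = pred_pts[:, None, :] - gt_pts[None, :, :]      # (P, G, ndim)
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)  # (P,)

def discontinuity_labels(pred_pts, gt_pts, shape, tau, half_width=1):
    """Flag predicted endpoints farther than tau from any GT endpoint
    and paint a small cube around each flagged point."""
    distances = min_endpoint_distances(pred_pts, gt_pts)
    flagged = pred_pts[distances > tau]
    mask = np.zeros(shape, dtype=np.uint8)
    for point in flagged:
        # Clip the cube to the volume bounds.
        region = tuple(
            slice(max(int(c) - half_width, 0), min(int(c) + half_width + 1, size))
            for c, size in zip(point, shape)
        )
        mask[region] = 1
    return distances, flagged, mask

gt_pts = np.array([[2, 2, 2], [10, 10, 10]])
pred_pts = np.array([[2, 2, 3], [6, 6, 6], [10, 10, 10]])
distances, flagged, mask = discontinuity_labels(
    pred_pts, gt_pts, shape=(12, 12, 12), tau=3.0  # tau is an assumed value
)
print(distances)   # nearest-GT distance per predicted endpoint
print(flagged)     # endpoints treated as likely discontinuities
print(mask.sum())  # number of voxels labeled as discontinuity
```

In this toy volume only the endpoint at (6, 6, 6) lies far from every GT endpoint, so it alone receives a 3x3x3 discontinuity cube. A predicted endpoint that matches a true vessel tip stays unlabeled.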
3. Dual-Attention-based Refinement (DAR) Module
The DAR module leverages attention mechanisms to refine the raw segmentation map:
- Global attention is derived from the skeleton map, highlighting connected tubular regions.
- Local attention is derived from the discontinuity map, focusing correction on predicted breakpoints.
- The refined segmentation fuses the raw segmentation map with both cues: each attention branch applies a convolution, and the skeleton and discontinuity maps are converted into the respective attention masks.
This lightweight design allows efficient fusion of contextual cues without significant additional computational burden.
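To make the fusion idea concrete, the sketch below shows one plausible way to combine the three maps. The fusion rule is an assumption made for illustration and should not be read as the paper's formula: the attention masks are taken to be sigmoids of the skeleton and discontinuity logits, each branch is a 1x1 convolution, and the result is added residually to the raw segmentation logits. All function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1x1(x, weight, bias):
    """1x1 convolution over a (C_in, H, W) array; weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

def dual_attention_refine(seg, skel, disc, global_conv, local_conv):
    """ASSUMED fusion rule, for illustration only:
    refined = seg + conv_g(seg * sigmoid(skel)) + conv_l(seg * sigmoid(disc))
    """
    global_mask = sigmoid(skel)  # (1, H, W), emphasizes the tubular backbone
    local_mask = sigmoid(disc)   # (1, H, W), emphasizes predicted breakpoints
    global_term = conv1x1(seg * global_mask, *global_conv)
    local_term = conv1x1(seg * local_mask, *local_conv)
    return seg + global_term + local_term  # residual refinement

rng = np.random.default_rng(0)
seg = rng.normal(size=(2, 8, 8))   # raw segmentation logits (2 classes)
skel = rng.normal(size=(1, 8, 8))  # skeleton head logits
disc = rng.normal(size=(1, 8, 8))  # discontinuity head logits

# With zero-initialized branches the module leaves the input unchanged.
zero_conv = (np.zeros((2, 2)), np.zeros(2))
identity_out = dual_attention_refine(seg, skel, disc, zero_conv, zero_conv)

rand_conv = (rng.normal(size=(2, 2)), np.zeros(2))
refined = dual_attention_refine(seg, skel, disc, rand_conv, rand_conv)
print(refined.shape)
```

Under this assumed form the module adds only two small convolutions and two elementwise products, which is consistent with the description of the refinement as lightweight.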
4. Quantitative and Qualitative Performance
GLCP was benchmarked on multiple 2D (STARE retinal vessel) and 3D (CCA, TopCoW vascular) datasets. Results demonstrate:
- Superior segmentation accuracy: Dice and clDice scores surpass baselines (e.g., improvement from 82.92% to 83.67% Dice on STARE compared to nnUNet).
- Better topological integrity: Lower Betti number errors and fewer fragmented connected components, indicating more complete vessel networks.
- Boundary quality: Lower Hausdorff Distances.
- Ablation studies confirm that skeleton prediction, discontinuity detection, self-supervised consistency, and the DAR module each contribute measurably to final performance.
Comparisons with specialized loss functions (clDice, cbDice, ske-recall) and multi-decoder alternatives establish GLCP’s superiority in combined accuracy and continuity.
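The difference between overlap accuracy and topological integrity can be seen in a small worked example. The sketch below computes the Dice coefficient and a Betti-0 error (the absolute difference in the number of connected components) on a toy 2D mask. It is not the evaluation code behind the reported numbers; the use of 8-connectivity and the function names are assumptions made for this illustration.

```python
from collections import deque

import numpy as np

def dice(pred, gt):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, gt).sum() / denom

def count_components(mask):
    """Number of 8-connected foreground components (Betti-0) in a 2D mask."""
    mask = mask.astype(bool)
    seen = np.zeros_like(mask)
    height, width = mask.shape
    components = 0
    for r in range(height):
        for c in range(width):
            if not mask[r, c] or seen[r, c]:
                continue
            components += 1
            seen[r, c] = True
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
    return components

gt = np.zeros((5, 9), dtype=np.uint8)
gt[2, 1:8] = 1   # one straight vessel, 7 pixels long
pred = gt.copy()
pred[2, 4] = 0   # a single-pixel break in the middle

dice_score = dice(pred, gt)
betti0_error = abs(count_components(pred) - count_components(gt))
print(dice_score, betti0_error)
```

A single missing pixel lowers Dice only to 12/13 (about 0.92) while splitting the vessel into two components, giving a Betti-0 error of 1. This is why connectivity-aware metrics are reported alongside Dice.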
5. Implications and Applications
GLCP’s ability to enhance both global and local structural preservation has meaningful downstream effects:
- Medical diagnostics: Improved vessel segmentation directly benefits retinal, cerebral, and coronary artery analysis, impacting disease diagnosis, risk stratification, and surgical planning.
- Biological network analysis: Accurate modeling of root, neural, or microvascular arrangements where connectivity is crucial.
- Non-medical domains: The framework is adaptable to curvilinear segmentation in remote sensing (e.g., roads from satellite imagery), plant biology, and engineered networks.
Its design aims to ensure that repairing breaks in finely detailed local structures does not compromise the global network, and that enforcing global structure does not degrade local detail.
6. Limitations and Future Directions
The primary limitations and avenues for further research are:
- Generalization: While validated across multiple encoder–decoder architectures, wider testing on more complex or atypical imaging modalities remains a research goal.
- Detection and handling of discontinuity points: Further refinement in dynamic thresholding, clustering, and integration of detected discontinuities may yield incremental improvements, as the present approach still depends on hand-tuned thresholds and patch-based selection.
- Computational complexity: Though the DAR module is lightweight, overall system complexity is increased. The balance between refinement accuracy and efficiency will benefit from additional studies.
A plausible implication is that the general framework—explicitly combining skeleton, segmentation, and discontinuity signals—could be transferred to other structured pattern recognition problems beyond medical imaging.
7. Conclusion
GLCP establishes a paradigm for global-to-local connectivity preservation in segmentation tasks, with explicit mechanisms for global structural modeling (skeletons), detection and correction of local discontinuities, and final attention-based refinement. The framework achieves state-of-the-art accuracy and continuity on vessel segmentation benchmarks and is supported by robust ablation and comparative studies. Its design principles and methodological innovations are broadly relevant for applications demanding connectivity preservation at multiple scales. Source code is to be released for community adoption and further advancement (Zhou et al., 28 Jul 2025).