Minimum Boundary Shift (MBS) Loss
- Minimum Boundary Shift (MBS) Loss is a segmentation loss that computes distance metrics over contours, addressing class imbalance in tasks like medical imaging.
- It reformulates the non-symmetric L2 distance between predicted and ground-truth boundaries into a tractable regional integral, enabling end-to-end gradient computation.
- Empirical results show that when combined with regional losses, MBS Loss enhances small lesion recovery and boundary sharpness with minimal computational overhead.
Minimum Boundary Shift (MBS) Loss, introduced by Kervadec et al., is a segmentation loss function designed to address challenges in highly unbalanced segmentation problems, particularly those common in medical imaging. Whereas standard losses like Dice or cross-entropy integrate over segmentation regions, the boundary (MBS) loss is formulated as a distance metric over the space of contours (boundaries) rather than regions. This construction enables losses that are less sensitive to class imbalance, and it can be seamlessly combined with conventional regional loss functions to yield improved training stability and accuracy in segmentation tasks, especially when the foreground is significantly smaller than the background (Kervadec et al., 2018).
1. Mathematical Definition and Contour Distance
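Consider a segmentation domain $\Omega$. Let $G \subset \Omega$ denote the ground-truth foreground region and $\partial G$ its boundary. Given a segmentation $S \subset \Omega$ predicted by a neural network (constructed by thresholding the softmax output $s_\theta$), the segmentation boundary is $\partial S$. The Minimum Boundary Shift (MBS) or boundary loss is based on the non-symmetric “contour” distance from $\partial G$ to $\partial S$:
$$\mathrm{Dist}(\partial G, \partial S) = \int_{\partial G} \lVert y_{\partial S}(p) - p \rVert^2 \, dp \approx 2 \int_{\Delta S} D_G(q) \, dq = 2 \left( \int_{S} \phi_G(q) \, dq - \int_{G} \phi_G(q) \, dq \right)$$
where $y_{\partial S}(p)$ is the point of $\partial S$ reached from $p \in \partial G$ along the normal to $\partial G$, $\Delta S$ is the region between the two contours, and $\phi_G$ is the signed distance transform from $\partial G$, defined as:
$$\phi_G(q) = \begin{cases} -D_G(q) & \text{if } q \in G \\ D_G(q) & \text{otherwise} \end{cases}$$
with $D_G(q)$ the Euclidean distance from $q$ to the nearest point of the ground-truth boundary $\partial G$. In practice, $\phi_G$ is efficiently computed via standard algorithms like SciPy’s distance_transform_edt applied to both $G$ and its complement.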
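The signed distance map can be sketched with SciPy as follows. This is a minimal illustration, not the authors' reference code: the helper name `signed_distance_map` and the choice to return zeros for an empty or full mask are assumptions made here, and discrete conventions (for example, whether pixels adjacent to the boundary get distance 0 or 1) vary between implementations.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    """Signed distance map for a binary mask: negative inside G, positive outside."""
    mask = mask.astype(bool)
    if not mask.any() or mask.all():
        # Assumption: without both foreground and background there is no
        # boundary, so a map of zeros is returned.
        return np.zeros(mask.shape, dtype=np.float64)
    # distance_transform_edt returns, for every nonzero element, the Euclidean
    # distance to the nearest zero element (and 0 for zero elements).
    dist_outside = distance_transform_edt(~mask)  # > 0 on background pixels
    dist_inside = distance_transform_edt(mask)    # > 0 on foreground pixels
    return dist_outside - dist_inside


# Toy ground truth: a 3x3 square inside a 7x7 image.
gt = np.zeros((7, 7), dtype=np.uint8)
gt[2:5, 2:5] = 1
phi = signed_distance_map(gt)
```

The same function works unchanged on a 3D volume, since `distance_transform_edt` accepts arrays of any dimensionality.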
2. Reformulation as Regional Integral on the Softmax Output
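Directly computing and differentiating integrals over boundaries is non-trivial. Kervadec et al. reformulate the contour distance as a regional integral, yielding an expression tractable for deep learning pipelines. Specifically:
$$\tfrac{1}{2} \, \mathrm{Dist}(\partial G, \partial S) = \int_{\Omega} \phi_G(q) \, s(q) \, dq - \int_{\Omega} \phi_G(q) \, g(q) \, dq$$
where $s$ and $g$ are the indicator functions for the prediction and ground truth, respectively. The second term does not depend on the network parameters and can be dropped. Substituting the softmax probability $s_\theta$ for the hard prediction $s$ yields the boundary loss:
$$\mathcal{L}_B(\theta) = \int_{\Omega} \phi_G(q) \, s_\theta(q) \, dq$$
For multi-class segmentation with $K$ classes and softmax outputs $s_\theta^k$, with signed distance maps $\phi_G^k$ for each class $k$:
$$\mathcal{L}_B(\theta) = \sum_{k=1}^{K} \int_{\Omega} \phi_G^k(q) \, s_\theta^k(q) \, dq$$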
This formulation allows the loss to be computed as a region-wise inner product, leveraging precomputed signed distance maps.
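On a pixel grid the integral becomes a sum, so the loss reduces to an element-wise product followed by a reduction. The sketch below assumes arrays of shape `(K, H, W)` and a plain sum; the function name `boundary_loss` is hypothetical, and many implementations average over pixels instead, which only rescales the loss.

```python
import numpy as np


def boundary_loss(probs: np.ndarray, dist_maps: np.ndarray) -> float:
    """Discrete boundary loss: sum over classes and pixels of phi * s.

    probs:     softmax probabilities, shape (K, H, W)
    dist_maps: precomputed signed distance maps, same shape
    """
    if probs.shape != dist_maps.shape:
        raise ValueError("probs and dist_maps must have the same shape")
    return float(np.sum(probs * dist_maps))


# Toy single-class example on a 1x3 image whose first pixel is foreground.
phi = np.array([[[-1.0, 1.0, 2.0]]])
perfect = np.array([[[1.0, 0.0, 0.0]]])  # matches the ground truth
over = np.array([[[1.0, 1.0, 1.0]]])     # over-segments the foreground
empty = np.zeros_like(phi)               # predicts no foreground at all

loss_perfect = boundary_loss(perfect, phi)
loss_over = boundary_loss(over, phi)
loss_empty = boundary_loss(empty, phi)
```

In this toy case the matching prediction attains the lowest value, and pixels predicted as foreground far outside the ground truth are penalized in proportion to their distance from the boundary.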
3. Differentiability, Computational Characteristics, and Implementation
The boundary loss is a linear functional of the softmax outputs. Its partial derivative with respect to the softmax probability at location $q$ is:
$$\frac{\partial \mathcal{L}_B}{\partial s_\theta(q)} = \phi_G(q)$$
and, for class $k$ in the multiclass case, $\partial \mathcal{L}_B / \partial s_\theta^k(q) = \phi_G^k(q)$. The signed distance maps are precomputed (and constant with respect to $\theta$), so the loss integrates trivially with any modern autodiff system.
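The linearity claim can be checked numerically. The sketch below uses random arrays as stand-ins for a distance map and a probability map (both are placeholders, not real data) and compares a finite-difference gradient of the discrete loss with the distance map itself.

```python
import numpy as np

rng = np.random.default_rng(0)
phi = rng.normal(size=(4, 4))   # stand-in for a precomputed signed distance map
s = rng.uniform(size=(4, 4))    # stand-in for foreground probabilities


def loss(probs: np.ndarray) -> float:
    # Discrete boundary loss for a single class.
    return float(np.sum(phi * probs))


# Forward finite differences, one pixel at a time.
eps = 1e-6
num_grad = np.zeros_like(s)
for idx in np.ndindex(s.shape):
    bumped = s.copy()
    bumped[idx] += eps
    num_grad[idx] = (loss(bumped) - loss(s)) / eps

# Largest deviation between the numerical gradient and phi.
max_err = float(np.max(np.abs(num_grad - phi)))
```

Because the loss is linear in the probabilities, the deviation is limited to floating-point rounding, whatever the value of `s`.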
From an implementation perspective:
- $\phi_G$ may attain large values far from the boundary. Clamping or normalizing (e.g., by the image size) is advised for numerical stability.
- Efficient generation of signed distance maps leverages existing libraries on either 2D slices or entire 3D volumes, with negative signs assigned to the interior of each ground-truth region.
- The computational cost of evaluating the boundary loss is negligible (1 to 2 ms per batch), making it markedly cheaper than alternatives such as the Hausdorff loss (which incurs a 10% slowdown over typical region losses).
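The clamping and normalization mentioned above might look like the following. Both helpers are hypothetical, and dividing by the image diagonal is only one possible reading of "normalizing by the image size"; the threshold `max_abs` is an illustrative parameter, not a value from the source.

```python
import numpy as np


def clamp_distance_map(phi: np.ndarray, max_abs: float) -> np.ndarray:
    """Limit the magnitude of the signed distances to max_abs."""
    return np.clip(phi, -max_abs, max_abs)


def scale_by_image_diagonal(phi: np.ndarray) -> np.ndarray:
    """Divide by the image diagonal (in pixels) so values stay within [-1, 1]
    for a genuine distance map. Assumption: one way to normalize by image size."""
    diagonal = float(np.sqrt(sum(dim ** 2 for dim in phi.shape)))
    return phi / diagonal


phi = np.array([[-1.0, 1.0, 2.0]])
clamped = clamp_distance_map(phi, max_abs=1.5)
scaled = scale_by_image_diagonal(phi)
```

Either transformation keeps the sign of every entry, so the interior of the ground truth still contributes negatively and the exterior positively.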
4. Augmentation with Regional Losses and Scheduling Strategies
When used in isolation, the boundary loss often yields trivial solutions (e.g., an empty foreground, $s_\theta \approx 0$ everywhere), as the gradients vanish rapidly. In practice, it is therefore added to a regional loss $\mathcal{L}_R$, giving a composite loss:
$$\mathcal{L}(\theta) = \mathcal{L}_R(\theta) + \alpha \, \mathcal{L}_B(\theta)$$
where $\alpha$ is a tunable scalar. Effective choices for $\mathcal{L}_R$ include:
- Generalized Dice Loss (GDL, hyperparameter-free for binary cases)
- Weighted Cross-Entropy (as in U-Net)
- Focal loss
- Karimi’s Hausdorff loss (for comparison)
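As one example of a regional term, a Generalized Dice Loss can be sketched in NumPy as below. This follows the commonly used formulation with class weights equal to the inverse squared class volume; the function name, the `(K, H, W)` layout, and the `eps` smoothing constant are assumptions of this sketch, and a class that is absent from the ground truth would need separate handling.

```python
import numpy as np


def generalized_dice_loss(probs: np.ndarray, onehot: np.ndarray,
                          eps: float = 1e-10) -> float:
    """Generalized Dice Loss for arrays of shape (K, H, W)."""
    axes = (1, 2)
    # Inverse squared volume: small classes receive large weights.
    weights = 1.0 / (onehot.sum(axis=axes) + eps) ** 2
    intersection = (probs * onehot).sum(axis=axes)
    total = (probs + onehot).sum(axis=axes)
    ratio = (weights * intersection).sum() / ((weights * total).sum() + eps)
    return float(1.0 - 2.0 * ratio)


# Two classes on a 2x2 image; class 1 occupies a single pixel.
foreground = np.array([[1.0, 0.0], [0.0, 0.0]])
onehot = np.stack([1.0 - foreground, foreground])

gdl_perfect = generalized_dice_loss(onehot, onehot)
gdl_wrong = generalized_dice_loss(1.0 - onehot, onehot)
```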
To balance the influence of boundary and regional terms during training, a “rebalance” schedule is recommended: start with a small $\alpha$ (e.g., $0.01$) and increase it in fixed increments (e.g., once per epoch) until it reaches a target value, so that early optimization is guided by $\mathcal{L}_R$, with the boundary term taking over as learning progresses.
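A schedule of this kind can be sketched in a few lines. The starting value 0.01 comes from the text; the increment `step` and the cap `target` are placeholder values chosen here for illustration, since the source does not state them, and both function names are hypothetical.

```python
def alpha_schedule(epoch: int, start: float = 0.01, step: float = 0.01,
                   target: float = 1.0) -> float:
    """Linearly increase alpha once per epoch, capped at target.

    step and target are illustrative placeholders, not values from the source.
    """
    return min(start + step * epoch, target)


def composite_loss(regional: float, boundary: float, alpha: float) -> float:
    """Regional loss plus the alpha-weighted boundary term."""
    return regional + alpha * boundary


# Weight of the boundary term over the first few epochs.
alphas = [alpha_schedule(epoch) for epoch in range(3)]
total = composite_loss(regional=0.5, boundary=-2.0, alpha=0.1)
```

Keeping the schedule as a pure function of the epoch makes it easy to log and to resume training from a checkpoint without storing extra state.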
5. Empirical Results and Comparative Performance
Boundary loss was evaluated on highly unbalanced medical segmentation datasets:
- ISLES Ischemic Stroke Lesion: 94 scans; strong foreground-background imbalance.
- WMH White-Matter Hyperintensities: 60 scans; similar class imbalance.
The following table summarizes the results on ISLES, using GDL as $\mathcal{L}_R$:
| Loss Type | Dice Coefficient (DSC) | 95%-Hausdorff (HD95, mm) |
|---|---|---|
| GDL only | ≈0.51 | ≈5.32 |
| + Boundary (2D) | 0.64 (+0.13) | 4.80 |
| + Boundary (3D) | 0.66 | 2.72 |
Similar, though smaller, improvements were observed when using cross-entropy or focal loss as $\mathcal{L}_R$. MBS loss promoted better recovery of small lesions, yielded sharper boundaries, and reduced false positives, while incurring a negligible computational overhead compared to Hausdorff loss-based approaches.
6. Practical Considerations and Significance
The MBS loss is straightforward to implement due to its linear formulation and closed-form gradients. Its main strengths are:
- Robust handling of class imbalance, since its integrals are over the interface region rather than the full segmentation mask.
- Easy integration with standard architectures and training pipelines.
- Consistent empirical improvement in both region- and boundary-based metrics, with added stability during training in highly unbalanced scenarios.
- Minimal additional computational cost.
A plausible implication is that, for highly imbalanced segmentation tasks (particularly prevalent in medical imaging), boundary loss offers a principled and efficient augmentation to existing region-based loss functions, enhancing both the quantitative and qualitative accuracy of neural network-based segmentation systems (Kervadec et al., 2018).