
Minimum Boundary Shift (MBS) Loss

Updated 9 February 2026
  • Minimum Boundary Shift (MBS) Loss is a segmentation loss that computes distance metrics over contours, addressing class imbalance in tasks like medical imaging.
  • It reformulates the non-symmetric L2 distance between predicted and ground-truth boundaries into a tractable regional integral, enabling end-to-end gradient computation.
  • Empirical results show that when combined with regional losses, MBS Loss enhances small lesion recovery and boundary sharpness with minimal computational overhead.

Minimum Boundary Shift (MBS) Loss, introduced by Kervadec et al., is a segmentation loss function designed to address challenges in highly unbalanced segmentation problems, particularly those common in medical imaging. Whereas standard losses like Dice or cross-entropy integrate over segmentation regions, the boundary (MBS) loss is formulated as a distance metric over the space of contours (boundaries) rather than regions. This construction enables losses that are less sensitive to class imbalance, and it can be seamlessly combined with conventional regional loss functions to yield improved training stability and accuracy in segmentation tasks, especially when the foreground is significantly smaller than the background (Kervadec et al., 2018).

1. Mathematical Definition and Contour Distance

Consider a segmentation domain $\Omega \subset \mathbb{R}^d$. Let $G \subset \Omega$ denote the ground-truth foreground region and $\partial G$ its boundary. Given a segmentation $S$ predicted by a neural network (constructed by thresholding the softmax output $s_\theta(x)$), the segmentation boundary is $\partial S$. The Minimum Boundary Shift (MBS) or boundary loss is based on the non-symmetric $L_2$ “contour” distance from $\partial G$ to $\partial S$:

$$d(\partial G, \partial S) = \int_{p \in \partial S} \varphi_{\partial G}(p)\, dp,$$

where $\varphi_{\partial G}(x)$ is the signed distance transform from $\partial G$, defined as:

$$\varphi_{\partial G}(x) = \begin{cases} +D_G(x) & x \notin G, \\ -D_G(x) & x \in G, \end{cases}$$

with $D_G(x) = \min_{z \in \partial G} \|x - z\|$ the Euclidean distance to the ground-truth boundary. In what follows the subscript is shortened and the signed distance map is written $\varphi_G$. In practice, $D_G$ is efficiently computed via standard algorithms like SciPy’s distance_transform_edt, applied to both $G$ and its complement.
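The signed distance map described above can be sketched with SciPy as follows. This is a minimal illustration, not the authors' reference code: the helper name `signed_distance_map` and the handling of empty or full masks are assumptions, and on a pixel grid the distances are measured between pixel centres, so values are only a discrete approximation of the distance to $\partial G$.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def signed_distance_map(gt_mask: np.ndarray) -> np.ndarray:
    """Signed distance to the mask boundary: negative inside G, positive outside."""
    gt = gt_mask.astype(bool)
    if not gt.any() or gt.all():
        # Assumption: with no boundary in the image, fall back to an all-zero map.
        return np.zeros(gt.shape, dtype=np.float64)
    # distance_transform_edt returns, for each non-zero element,
    # the Euclidean distance to the nearest zero element.
    outside = distance_transform_edt(~gt)  # background pixels: distance to nearest foreground pixel
    inside = distance_transform_edt(gt)    # foreground pixels: distance to nearest background pixel
    return outside - inside


# Toy example: a 3x3 square of foreground inside a 7x7 image.
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True
phi = signed_distance_map(mask)
```

The same function works unchanged on 3D volumes, since `distance_transform_edt` accepts arrays of any dimensionality.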

2. Reformulation as Regional Integral on the Softmax Output

Direct calculation and differentiation of integrals over the boundary $\partial S$ are non-trivial. Kervadec et al. reformulate the contour distance as a regional integral, yielding an expression tractable for deep learning pipelines. Specifically:

$$\frac{1}{2}\, d(\partial G, \partial S) = \int_{S} \varphi_G(q)\, dq - \int_{G} \varphi_G(q)\, dq = \int_{\Omega} \varphi_G(q)\, s(q)\, dq - \int_{\Omega} \varphi_G(q)\, g(q)\, dq,$$

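where $s(q)$ and $g(q)$ are the indicator functions for the prediction and ground truth, respectively. The second term does not depend on the network parameters $\theta$ and can be dropped. Substituting the softmax probability $s_\theta(q)$ for the hard prediction $s(q)$ in the first term yields the boundary loss: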

$$L_B(\theta) = \int_{\Omega} \varphi_G(q)\, s_\theta(q)\, dq.$$

For multi-class segmentation with $C$ classes and softmax outputs $P_c(x)$, with signed distance maps $\varphi_c(x)$ for each class $c$:

$$L_B(\theta) = \sum_{c=1}^{C} \int_{\Omega} \varphi_c(x)\, P_c(x)\, dx.$$

This formulation allows the loss to be computed as a region-wise inner product, leveraging precomputed signed distance maps.
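As a rough illustration of the inner-product form, the discrete loss can be written in a few lines of NumPy. The function name `boundary_loss`, the `(C, H, W)` array layout, and the choice of a pixel mean in place of the integral are assumptions made for this sketch; a training pipeline would use the equivalent expression in its autodiff framework.

```python
import numpy as np


def boundary_loss(probs: np.ndarray, dist_maps: np.ndarray) -> float:
    """Discrete boundary loss: sum over classes of phi_c * P_c, averaged over pixels.

    probs, dist_maps: arrays of shape (C, H, W).
    """
    if probs.shape != dist_maps.shape:
        raise ValueError("probs and dist_maps must have the same shape")
    # Assumption: the pixel mean stands in for the integral over the domain.
    return float((probs * dist_maps).sum(axis=0).mean())


# Toy two-class example on a 2x2 image (class 0 = background, class 1 = foreground).
phi_fg = np.array([[-1.0, 1.0], [1.0, 2.0]])   # negative inside the ground truth
dist_maps = np.stack([-phi_fg, phi_fg])

fg_perfect = np.array([[1.0, 0.0], [0.0, 0.0]])
perfect = np.stack([1.0 - fg_perfect, fg_perfect])         # matches the ground truth
everything = np.stack([np.zeros((2, 2)), np.ones((2, 2))])  # predicts foreground everywhere

loss_perfect = boundary_loss(perfect, dist_maps)
loss_all = boundary_loss(everything, dist_maps)
```

In this toy case the prediction that matches the ground truth receives the lower loss, because probability mass placed far outside the true region is weighted by large positive distances.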

3. Differentiability, Computational Characteristics, and Implementation

The boundary loss is a linear functional of the softmax outputs. Its partial derivative with respect to the softmax probability at location $x$ is:

$$\frac{\partial L_B}{\partial s_\theta(x)} = \varphi_G(x)$$

and, for class $c$ in the multi-class case, $\frac{\partial L_B}{\partial P_c(x)} = \varphi_c(x)$. The signed distance maps $\varphi$ are precomputed (and constant with respect to $\theta$), so the loss integrates trivially with any modern autodiff system.
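The linearity claim can be checked numerically. The sketch below compares a finite-difference gradient of the discrete (summed) loss with the distance map itself; the random arrays are stand-ins for a real distance map and real network outputs, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
phi = rng.normal(size=(4, 4))   # stand-in for a precomputed signed distance map
s = rng.uniform(size=(4, 4))    # stand-in for foreground probabilities


def loss(s_map: np.ndarray) -> float:
    """Discrete boundary loss using a plain sum over pixels."""
    return float((phi * s_map).sum())


# Forward finite differences, one pixel at a time.
eps = 1e-6
grad_fd = np.zeros_like(s)
for idx in np.ndindex(s.shape):
    bumped = s.copy()
    bumped[idx] += eps
    grad_fd[idx] = (loss(bumped) - loss(s)) / eps

# For a linear functional the gradient is phi itself, up to rounding error.
max_err = float(np.abs(grad_fd - phi).max())
```

If the loss is averaged over pixels instead of summed, the gradient is the same map divided by the number of pixels.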

From an implementation perspective:

  • $\varphi$ may attain large values far from the boundary. Clamping or normalizing $\varphi$ (e.g., by the image size) is advised for numerical stability.
  • Efficient generation of signed distance maps leverages existing libraries on either 2D slices or entire 3D volumes, with negative signs assigned to the interior of each ground-truth region.
  • The computational cost of evaluating the boundary loss is negligible (1–2 ms per batch), markedly lower than that of alternatives like the Hausdorff loss (which incurs a slow-down of roughly 10% over typical region losses).
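The normalization and clamping advice above could look like the following sketch. Scaling by the image diagonal is one possible reading of "image size" and is an assumption here, as are the function name `normalize_distance_map` and the optional `clip` parameter.

```python
import numpy as np


def normalize_distance_map(phi: np.ndarray, clip=None) -> np.ndarray:
    """Scale a signed distance map by the image diagonal, then optionally clamp it."""
    # Assumption: "image size" is taken to be the diagonal length in pixels.
    diag = float(np.sqrt(sum(n ** 2 for n in phi.shape)))
    scaled = phi / diag
    if clip is not None:
        # Clamp to [-clip, clip] so far-away pixels cannot dominate the loss.
        scaled = np.clip(scaled, -clip, clip)
    return scaled


# Toy map on a 3x4 grid (diagonal length 5).
phi_raw = np.full((3, 4), 10.0)
phi_raw[0, 0] = -5.0
phi_scaled = normalize_distance_map(phi_raw)
phi_clipped = normalize_distance_map(phi_raw, clip=1.5)
```

Whichever scaling is chosen, it should be applied identically to every training sample so that the relative weight of the boundary term stays comparable across images.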

4. Augmentation with Regional Losses and Scheduling Strategies

When used in isolation, boundary loss alone often yields trivial solutions (e.g., empty foreground, $s_\theta \approx 0$ everywhere), as the gradients vanish rapidly. Therefore, practical usage always augments a regional loss $L_R$ with the boundary term, leading to a composite loss:

$$L(\theta) = L_R(\theta) + \alpha \cdot L_B(\theta)$$

where $\alpha$ is a tunable scalar. Effective choices for $L_R$ include the generalized Dice loss (GDL), cross-entropy, and focal loss.

To balance the influence of boundary and regional terms during training, a “rebalance” schedule is recommended: start with a small $\alpha_0$ (e.g., 0.01) and increase it in increments (e.g., +0.01 per epoch) until it reaches a target $\alpha$ (between 0.5 and 1.0), so that early optimization is guided by $L_R$, with the boundary term taking over as learning progresses.
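A linear ramp of this kind can be sketched as below. The function names and default values are illustrative assumptions chosen to match the example numbers in the text, not settings taken from the paper's code.

```python
def alpha_schedule(epoch: int, alpha_start: float = 0.01,
                   step: float = 0.01, alpha_max: float = 1.0) -> float:
    """Boundary-term weight for a given epoch: linear ramp capped at alpha_max."""
    return min(alpha_start + step * epoch, alpha_max)


def composite_loss(regional: float, boundary: float, alpha: float) -> float:
    """Composite loss L = L_R + alpha * L_B for already-computed loss values."""
    return regional + alpha * boundary


# Weights over the first few epochs, and far into training once the cap is reached.
early = [alpha_schedule(e) for e in range(3)]
late = alpha_schedule(500)
example = composite_loss(regional=0.4, boundary=-0.2, alpha=0.5)
```

With these defaults the weight reaches the cap after 99 epochs; a shorter training run would call for a larger `step` or a lower `alpha_max`.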

5. Empirical Results and Comparative Performance

Boundary loss was evaluated on highly unbalanced medical segmentation datasets:

  • ISLES Ischemic Stroke Lesion: 94 scans; strong foreground-background imbalance.
  • WMH White-Matter Hyperintensities: 60 scans; similar class imbalance.

The following table summarizes the results on ISLES, using GDL as $L_R$:

| Loss Type | Dice Coefficient (DSC) | 95%-Hausdorff (HD95, mm) |
|---|---|---|
| GDL only | ≈0.51 | ≈5.32 |
| + Boundary (2D) | 0.64 (+13%) | 4.80 |
| + Boundary (3D) | 0.66 | 2.72 |

Similar, though smaller, improvements were observed when using cross-entropy or focal loss as $L_R$. MBS loss promoted better recovery of small lesions, yielded sharper boundaries, and reduced false positives, while incurring negligible computational overhead compared to Hausdorff loss-based approaches.

6. Practical Considerations and Significance

The MBS loss is straightforward to implement due to its linear formulation and closed-form gradients. Its main strengths are:

  • Robust handling of class imbalance, since it is derived from an integral over the interface between regions rather than from unbalanced integrals over the regions themselves.
  • Easy integration with standard architectures and training pipelines.
  • Consistent empirical improvement in both region- and boundary-based metrics, with added stability during training in highly unbalanced scenarios.
  • Minimal additional computational cost.

A plausible implication is that, for highly imbalanced segmentation tasks (particularly prevalent in medical imaging), boundary loss offers a principled and efficient augmentation to existing region-based loss functions, enhancing both the quantitative and qualitative accuracy of neural network-based segmentation systems (Kervadec et al., 2018).
