Minimum Boundary Shift (MBS) Loss

Updated 9 February 2026

Minimum Boundary Shift (MBS) Loss is a segmentation loss that computes distance metrics over contours, addressing class imbalance in tasks like medical imaging.
It reformulates the non-symmetric L2 distance between predicted and ground-truth boundaries into a tractable regional integral, enabling end-to-end gradient computation.
Empirical results show that when combined with regional losses, MBS Loss enhances small lesion recovery and boundary sharpness with minimal computational overhead.

Minimum Boundary Shift (MBS) Loss, introduced by Kervadec et al., is a segmentation loss function designed to address challenges in highly unbalanced segmentation problems, particularly those common in medical imaging. Whereas standard losses like Dice or cross-entropy integrate over segmentation regions, the boundary (MBS) loss is formulated as a distance metric over the space of contours (boundaries) rather than regions. This construction enables losses that are less sensitive to class imbalance, and it can be seamlessly combined with conventional regional loss functions to yield improved training stability and accuracy in segmentation tasks, especially when the foreground is significantly smaller than the background (Kervadec et al., 2018).

1. Mathematical Definition and Contour Distance

Consider a segmentation domain $\Omega \subset \mathbb{R}^d$. Let $G \subset \Omega$ denote the ground-truth foreground region and $\partial G$ its boundary. Given a segmentation $S$ predicted by a neural network (obtained by thresholding the softmax output $s_\theta(x)$), the segmentation boundary is $\partial S$. The Minimum Boundary Shift (MBS) loss, or boundary loss, is based on the non-symmetric $L_2$ contour distance from $\partial G$ to $\partial S$:

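$d(\partial G, \partial S) = \int_{\partial G} \|y_{\partial S}(p) - p\|^2\, dp,$

where $p$ is a point on $\partial G$ and $y_{\partial S}(p)$ is the corresponding point on $\partial S$ along the normal to $\partial G$ at $p$. This distance is approximated by $2 \int_{\Delta S} D_G(q)\, dq$, where $\Delta S$ is the region between the two contours. Written over regions, it uses the signed distance map $\varphi_G$ of the ground-truth boundary:

$\varphi_G(x) = \begin{cases} +D_G(x) & x \notin G, \\ -D_G(x) & x \in G, \end{cases}$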

with $D_G(x) = \min_{z \in \partial G} \|x-z\|$ the Euclidean distance to the ground-truth boundary. In practice, $D_G$ is efficiently computed via standard algorithms like SciPy’s distance_transform_edt on both $G$ and its complement.
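
A minimal sketch of building the signed distance map with SciPy, assuming a binary NumPy mask as input. The function name signed_distance_map and the all-zero fallback for masks without a boundary are illustrative choices, not taken from the source. Distances are measured between pixel centres, so the result only approximates the continuous $D_G$.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    """Signed distance map: positive outside the mask, negative inside.

    Hypothetical helper for illustration; `mask` is a binary array
    (2D slice or 3D volume) marking the ground-truth foreground G.
    """
    mask = mask.astype(bool)
    # Assumed convention: with no boundary (empty or full mask),
    # return zeros so the boundary term contributes nothing.
    if not mask.any() or mask.all():
        return np.zeros(mask.shape, dtype=np.float32)
    # distance_transform_edt gives, for each nonzero element,
    # the Euclidean distance to the nearest zero element.
    outside = distance_transform_edt(~mask)  # background -> nearest foreground
    inside = distance_transform_edt(mask)    # foreground -> nearest background
    return (outside - inside).astype(np.float32)
```

Since the map depends only on the ground truth, it can be computed once per label map before training and stored alongside it.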

2. Reformulation as Regional Integral on the Softmax Output

Directly computing and differentiating integrals over the boundary $\partial S$ is non-trivial. Kervadec et al. reformulate the contour distance as a regional integral, yielding an expression that is tractable in deep learning pipelines:

$\frac{1}{2} d(\partial G, \partial S) = \int_{S} \varphi_G(q)\, dq - \int_{G} \varphi_G(q)\, dq = \int_{\Omega} \varphi_G(q) s(q)\, dq - \int_{\Omega} \varphi_G(q) g(q)\, dq,$

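where $s(q)$ and $g(q)$ are the binary indicator functions of the prediction $S$ and the ground truth $G$, respectively. The second term does not depend on the network parameters $\theta$ and can be dropped. Replacing the hard prediction $s(q)$ with the softmax probability $s_\theta(q)$ then yields the boundary loss: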

$L_B(\theta) = \int_{\Omega} \varphi_G(q) s_\theta(q)\, dq.$

For multi-class segmentation with $C$ classes, let $P_c(x)$ be the softmax output and $\varphi_c(x)$ the signed distance map for class $c$. The loss sums the per-class terms:

$L_B(\theta) = \sum_{c=1}^{C} \int_{\Omega} \varphi_c(x) P_c(x)\, dx.$

This formulation allows the loss to be computed as a region-wise inner product, leveraging precomputed signed distance maps.
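
The region-wise inner product can be sketched in NumPy as follows. The function name boundary_loss and the (C, H, W) array layout are assumptions made for illustration. The sketch also averages over pixels rather than summing, an assumed normalization that changes the integral only by a constant factor.

```python
import numpy as np


def boundary_loss(probs: np.ndarray, dist_maps: np.ndarray) -> float:
    """Discrete boundary loss for one image.

    probs:     softmax outputs P_c(x), shape (C, H, W)
    dist_maps: precomputed signed distance maps phi_c(x), same shape
    """
    if probs.shape != dist_maps.shape:
        raise ValueError("probs and dist_maps must have the same shape")
    # Sum phi_c * P_c over classes, then average over pixels
    # (assumed normalization; the integral corresponds to a plain sum).
    per_pixel = np.sum(probs * dist_maps, axis=0)
    return float(np.mean(per_pixel))
```

Probability mass placed inside the ground-truth region (negative $\varphi$) lowers the loss, while mass placed far outside it (large positive $\varphi$) raises it.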

3. Differentiability, Computational Characteristics, and Implementation

The boundary loss is a linear functional of the softmax outputs. Its partial derivatives with respect to the softmax probability at location $x$ are:

$\frac{\partial L_B}{\partial s_\theta(x)} = \varphi_G(x)$

and, for class $c$ in the multi-class case, $\frac{\partial L_B}{\partial P_c(x)} = \varphi_c(x)$. The signed distance maps $\varphi$ are precomputed and constant with respect to $\theta$, so the loss integrates trivially with any modern autodiff system.
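
As an illustration of what an autodiff system computes here, the sketch below chains the derivative above through a softmax at a single pixel. By the chain rule, the gradient with respect to the logits $z_c$ is $P_c(\varphi_c - \sum_k P_k \varphi_k)$; this expression follows from the softmax Jacobian and is not stated in the source. All function names and sample values are hypothetical.

```python
import numpy as np


def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()


def pixel_loss(z: np.ndarray, phi: np.ndarray) -> float:
    """Boundary loss contribution of one pixel: sum_c phi_c * P_c."""
    return float(np.dot(phi, softmax(z)))


def analytic_grad(z: np.ndarray, phi: np.ndarray) -> np.ndarray:
    """dL/dz_c = P_c * (phi_c - sum_k P_k * phi_k)."""
    p = softmax(z)
    return p * (phi - np.dot(phi, p))


def numeric_grad(z: np.ndarray, phi: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Central finite differences, used only as a cross-check."""
    grad = np.zeros_like(z)
    for c in range(z.size):
        step = np.zeros_like(z)
        step[c] = eps
        grad[c] = (pixel_loss(z + step, phi) - pixel_loss(z - step, phi)) / (2 * eps)
    return grad


z = np.array([0.3, -1.2, 2.0])     # example logits for three classes
phi = np.array([4.0, -2.5, 1.0])   # example signed distances at this pixel
print(np.allclose(analytic_grad(z, phi), numeric_grad(z, phi), atol=1e-6))
```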

From an implementation perspective:

$\varphi$ may attain large values far from the boundary. Clamping or normalizing $\varphi$ (e.g., by the image size) is advised for numerical stability.
Signed distance maps can be generated efficiently with existing libraries, on either 2D slices or entire 3D volumes, with negative signs assigned to the interior of each ground-truth region.
Evaluating the boundary loss costs little (1 to 2 ms per batch), markedly less than alternatives such as the Hausdorff loss (which incurs a slow-down of about 10% over typical region losses).
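
One way to apply the normalization and clamping advice, as a sketch only. Here "image size" is assumed to mean the largest spatial dimension, and both the function name rescale_distance_map and the clip parameter are hypothetical.

```python
import numpy as np


def rescale_distance_map(phi: np.ndarray, clip=None) -> np.ndarray:
    """Normalize a signed distance map and optionally clamp it.

    Assumption: "image size" is taken as the largest spatial dimension.
    `clip`, if given, bounds the normalized values to [-clip, clip].
    """
    scale = float(max(phi.shape))
    out = phi.astype(np.float32) / scale  # division preserves the sign
    if clip is not None:
        out = np.clip(out, -clip, clip)
    return out
```

Because the loss is linear in $\varphi$, rescaling by a constant is equivalent to rescaling the weight of the boundary term, whereas clamping changes how strongly far-away pixels are penalized.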

4. Augmentation with Regional Losses and Scheduling Strategies

Used in isolation, the boundary loss often converges to trivial solutions (e.g., an empty foreground with $s_\theta \approx 0$ everywhere), where the gradients become very small. In practice it is therefore combined with a regional loss $L_R$, giving the composite loss:

$L(\theta) = L_R(\theta) + \alpha \cdot L_B(\theta)$

where $\alpha$ is a tunable scalar. Regional losses paired with the boundary term, and the baseline it is compared against, include:

Generalized Dice Loss (GDL, hyperparameter-free for binary cases)
Weighted Cross-Entropy (as in U-Net)
Focal loss ($\gamma = 2$)
Karimi’s Hausdorff loss (a comparison baseline rather than a choice of $L_R$)
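
The composite loss can be sketched with a generalized Dice term as $L_R$. The GDL form used here, with class weights equal to the inverse squared ground-truth volume, is the commonly used definition and is assumed rather than taken from the source. The function names, the eps stabilizer, and the (C, H, W) layout are also illustrative.

```python
import numpy as np


def generalized_dice_loss(probs: np.ndarray, onehot: np.ndarray, eps: float = 1e-8) -> float:
    """GDL sketch; probs and onehot have shape (C, H, W)."""
    axes = tuple(range(1, probs.ndim))
    # Class weights: inverse squared ground-truth volume (assumed form).
    weights = 1.0 / (onehot.sum(axis=axes) ** 2 + eps)
    intersect = (probs * onehot).sum(axis=axes)
    total = (probs + onehot).sum(axis=axes)
    return float(1.0 - 2.0 * (weights * intersect).sum() / ((weights * total).sum() + eps))


def boundary_loss(probs: np.ndarray, dist_maps: np.ndarray) -> float:
    """Sum phi_c * P_c over classes, averaged over pixels (assumed normalization)."""
    return float(np.mean(np.sum(probs * dist_maps, axis=0)))


def composite_loss(probs, onehot, dist_maps, alpha: float) -> float:
    """L = L_R + alpha * L_B, with GDL standing in for L_R."""
    return generalized_dice_loss(probs, onehot) + alpha * boundary_loss(probs, dist_maps)
```

Any of the regional losses listed above could replace generalized_dice_loss without changing the boundary term.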

To balance the boundary and regional terms during training, a “rebalance” schedule is recommended: start with a small $\alpha_0$ (e.g., $0.01$) and increase it in fixed increments (e.g., $+0.01$ per epoch) until it reaches a target $\alpha$ (between $0.5$ and $1.0$). Early optimization is then guided by $L_R$, with the boundary term taking on more weight as learning progresses.
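
A minimal sketch of such a schedule, using the example values from the text as defaults. The function name alpha_schedule and its parameter names are hypothetical, and epochs are assumed to be counted from zero.

```python
def alpha_schedule(epoch: int, alpha_start: float = 0.01,
                   step: float = 0.01, alpha_max: float = 1.0) -> float:
    """Linearly increase alpha each epoch, capped at alpha_max."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return min(alpha_start + step * epoch, alpha_max)


# Example: weight of the boundary term over the first few epochs.
for epoch in range(3):
    print(epoch, alpha_schedule(epoch))
```

With these defaults the cap of $1.0$ is reached after 99 epochs; a lower alpha_max or a larger step shortens the ramp.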

5. Empirical Results and Comparative Performance

Boundary loss was evaluated on highly unbalanced medical segmentation datasets:

ISLES (ischemic stroke lesions): 94 scans; strong foreground-background imbalance.
WMH (white-matter hyperintensities): 60 scans; similar class imbalance.

The following table summarizes results on ISLES, using GDL as $L_R$:

Loss Type	Dice Coefficient (DSC)	95th-percentile Hausdorff distance (HD95, mm)
GDL only	≈0.51	≈5.32
GDL + Boundary (2D)	0.64 (+0.13 absolute)	4.80
GDL + Boundary (3D)	0.66	2.72

Similar, though smaller, improvements were observed when using cross-entropy or focal loss as $L_R$. The MBS loss promoted better recovery of small lesions, yielded sharper boundaries, and reduced false positives, while adding negligible computational overhead, in contrast to Hausdorff loss-based approaches.

6. Practical Considerations and Significance

The MBS loss is straightforward to implement due to its linear formulation and closed-form gradients. Its main strengths are:

Robust handling of class imbalance, since the underlying contour distance integrates over the interface between regions rather than over the regions themselves.
Easy integration with standard architectures and training pipelines.
Consistent empirical improvement in both region- and boundary-based metrics, with added stability during training in highly unbalanced scenarios.
Minimal additional computational cost.

A plausible implication is that, for highly imbalanced segmentation tasks (particularly prevalent in medical imaging), boundary loss offers a principled and efficient augmentation to existing region-based loss functions, enhancing both the quantitative and qualitative accuracy of neural network-based segmentation systems (Kervadec et al., 2018).

References

Kervadec et al. (2018). Boundary loss for highly unbalanced segmentation.
