Boundary-to-Region (B2R): Core Concepts

Updated 4 July 2026

Boundary-to-Region (B2R) is a modeling approach that employs boundary cues as primary signals to restrict and enhance region-level inference.
B2R frameworks improve tasks like dense semantic segmentation and safe reinforcement learning by converting sparse boundary information into robust, contextual region supervision.
Its applications, which range from image segmentation to urban analytics, show how boundary-conditioned processing tailored to a task can improve performance.

Boundary-to-Region (B2R) denotes a class of formulations in which a boundary, or a boundary-like constraint, is treated as the primary organizer of region-level inference. In the cited literature, this idea appears in several technically distinct forms: detected object boundaries act as the source of contextual signals aggregated into interior pixels in semantic segmentation; learned semantic boundaries gate long-range propagation within segments; boundary evidence is converted into separated instance proposals; signed distance maps become region-level supervision weights; shadow, boundary, and non-shadow content are reordered before state-space scanning; cost-to-go is reinterpreted as a fixed safety boundary in offline safe reinforcement learning; and user-specified geographic boundaries are transformed into token sets and induced subgraphs for elastic region representation (Ma et al., 2021, Ding et al., 2019, Chen et al., 2020, Valverde et al., 2021, Zhu et al., 2024, Su et al., 30 Sep 2025, Zhu et al., 11 Mar 2025). The term therefore names a recurrent operational pattern rather than a single algorithm.

1. Conceptual scope

Across these works, B2R consistently replaces undifferentiated global processing with a boundary-conditioned restriction on where information originates, how it propagates, or which samples supervise the model. In dense prediction, the restriction is spatial; in restoration, it is a sequence permutation; in safe RL, it is a conditioning semantics; in urban representation, it is a geometry-to-subgraph map; and in mathematical or mechanical settings, it concerns how boundaries determine feasible regions or their projections.

Domain	Boundary signal	Region-level operation
Semantic segmentation	Detected object boundaries	Context aggregation into interior pixels
Boundary-aware propagation	Learned semantic boundary probability	Gated feature diffusion within segments
Nucleus segmentation	Instance boundary probability map	Connected-component proposals after boundary subtraction
Region-wise supervision	Signed distance or boundary-derived maps	Per-pixel, per-class loss weighting
Shadow removal	Shadow mask and boundary windows	Region-typed sequence reordering for SSM scanning
Offline safe RL	Fixed safety budget boundary	Region-wide supervision over safe trajectories
Urban representation	Prompted geometry	Token-set extraction and subgraph embedding

This diversity also clarifies that “boundary” is not always a geometric contour. In BCANet and BFP, it is a semantic edge field; in ShadowMamba, a window containing both mask values is a “boundary window”; in offline safe RL, the boundary is the deployment-time cost budget $\kappa$; and in BPURF, the boundary is a prompted polygon or related geometry (Ma et al., 2021, Ding et al., 2019, Zhu et al., 2024, Su et al., 30 Sep 2025, Zhu et al., 11 Mar 2025).

2. Boundary-guided dense prediction

In "Boundary Guided Context Aggregation for Semantic Segmentation" (Ma et al., 2021), B2R is defined explicitly as the use of detected object boundaries as the source of contextual signals aggregated into interior pixels. BCANet consists of a Multi-Scale Boundary (MSB) extractor and a Boundary guided Context Aggregation (BCA) module. MSB taps the last residual block of each backbone stage in a dilated ResNet-101, unifies channels to $256$, resizes all scales to $1/8$ input resolution, and predicts a binary boundary map with a $1\times 1$ convolution and Sigmoid. BCA then performs a Non-local-style cross-stream attention in which semantic features $A$ and boundary features $B$ are projected to $A_1,B_1\in\mathbb{R}^{C\times N}$ and used to form a boundary-semantic affinity

$F(i,j)=\frac{\exp(B_{1i}^\top A_{1j})}{\sum_{i'=1}^{N}\exp(B_{1i'}^\top A_{1j})},$

followed by the residual update

$D_j = A_j + \sum_{i=1}^{N}F(i,j)\,A_{2i}.$

The intended effect is that attention weights concentrate on boundary positions and push boundary-informed context into inner pixels, thereby increasing intra-class consistency and reducing inter-class confusion. On Cityscapes, relative to a dilated FCN baseline with an interior F-score of $75.07$, BCANet is reported to improve both interior and boundary F-score, and results for BCANet combined with SegFix are reported for boundary F-score and mIoU. The paper also reports mIoU for BCANet with ResNet-101 on the Cityscapes test set, and mIoU and pixel accuracy on ADE20K validation (Ma et al., 2021).
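The affinity and residual update above can be sketched in a few lines of NumPy. This is a minimal illustration, not BCANet's implementation: the function name is hypothetical, the projections `A1`, `A2`, and `B1` are passed in ready-made rather than learned, and treating `A2` as a third projection of the semantic features is an assumption based on its appearance in the update formula.

```python
import numpy as np

def boundary_context_aggregation(A, A1, A2, B1):
    """Sketch of the boundary-semantic affinity and residual update.

    A, A2 : (C, N) semantic features and an assumed projection of them.
    A1, B1: (C', N) projected semantic and boundary features.
    Returns D with D[:, j] = A[:, j] + sum_i F[i, j] * A2[:, i], and F itself.
    """
    logits = B1.T @ A1                                   # logits[i, j] = B1_i . A1_j
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability only
    F = np.exp(logits)
    F = F / F.sum(axis=0, keepdims=True)                 # softmax over i for each j
    D = A + A2 @ F                                       # residual update
    return D, F

# Toy usage with random features (shapes are illustrative).
rng = np.random.default_rng(0)
C, N = 4, 6
A, A1, A2, B1 = (rng.normal(size=(C, N)) for _ in range(4))
D, F = boundary_context_aggregation(A, A1, A2, B1)
```

Each column of `F` sums to one, so every position `j` receives a convex combination of the `A2` features, weighted by how strongly each boundary feature matches it.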

A related but architecturally distinct formulation appears in "Boundary-Aware Feature Propagation for Scene Segmentation" (Ding et al., 2019). Here the network learns “boundary” as an additional semantic class, so a boundary confidence map is obtained from the same softmax that scores the ordinary classes. B2R is implemented through a multiplicative propagation gate rather than an attention map. The boundary confidence is converted to a propagation confidence by a parameterized mapping with three parameters, one of them learnable, and feature propagation along Unidirectional Acyclic Graphs (UAGs) is scaled by that confidence.

Messages inside a region are therefore passed strongly, while messages that would cross a boundary are attenuated. The paper replaces expensive DAG-style pixel scans with UAGs, which reduce the number of sequential steps, and reports nearly identical mIoU to DAGs on PASCAL-Context under matched settings. Quantitatively, BFP reports mIoU on PASCAL-Context, CamVid, and the Cityscapes test set, with large gains over FCN baselines and further gains from boundary-aware gating (Ding et al., 2019).
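The gating behaviour can be illustrated with a one-dimensional toy scan. This is only a sketch of the general idea of a multiplicative boundary gate: the gate `p = 1 - boundary_prob`, the scalar recurrence, and the function name are assumptions made for illustration, and they are not the propagation-confidence formula or the UAG update used in the paper.

```python
import numpy as np

def gated_scan(x, boundary_prob):
    """Left-to-right scan h[i] = x[i] + p[i] * h[i-1].

    p[i] = 1 - boundary_prob[i] is an ASSUMED gate: a confident boundary at
    position i blocks the message arriving from position i-1.
    """
    x = np.asarray(x, dtype=float)
    p = 1.0 - np.asarray(boundary_prob, dtype=float)
    h = np.zeros_like(x)
    h[0] = x[0]
    for i in range(1, len(x)):
        h[i] = x[i] + p[i] * h[i - 1]
    return h

# A boundary at index 2 cuts the accumulated message.
h = gated_scan([1, 1, 1, 1], [0, 0, 1, 0])
# h is [1., 2., 1., 2.]
```

Without the boundary, the same input would accumulate to `[1, 2, 3, 4]`; the gate resets the accumulation where the boundary confidence is high, which is the sense in which messages stay within a region.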

Taken together, these two papers establish the main dense-prediction reading of B2R: boundaries are not merely auxiliary edge targets, but explicit regulators of where region semantics should be gathered from.

3. From boundaries to instances, losses, and variational contours

In crowded nucleus segmentation, "Boundary-assisted Region Proposal Networks for Nucleus Segmentation" (Chen et al., 2020) realizes B2R by converting boundary evidence into separated instance regions before refinement. BRP-Net predicts both a semantic nucleus map and an instance boundary map using a Task-aware Feature Encoding network with task-specific streams and residual Feature Fusion Modules. Proposal generation follows a contour-subtraction rule: threshold the two maps, set boundary pixels in the semantic map to zero, and compute connected components of the remaining foreground. This produces disjoint coarse proposals, which are then refined by proposal-wise instance segmentation networks. The paper stresses that the original nomenclature is “Boundary-assisted Region Proposal,” but also states that this procedure is conceptually a B2R process. BRP-Net reports AJI and F1 on Kumar and Dice1/Dice2/AJI on CPM17, while emphasizing robustness to the dilation-radius hyper-parameter that affects contour-based splitting (Chen et al., 2020).
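The contour-subtraction rule can be sketched as follows. The thresholds of 0.5, the 4-connectivity, and the function name are assumptions chosen for illustration; the sketch covers only coarse proposal generation and not the subsequent proposal-wise refinement.

```python
import numpy as np
from collections import deque

def boundary_subtracted_proposals(sem_prob, bnd_prob, sem_thr=0.5, bnd_thr=0.5):
    """Threshold both maps, zero out boundary pixels, label connected components.

    Returns (labels, n): labels is an int array with 0 for background and
    1..n for the coarse proposals. Thresholds and connectivity are assumed.
    """
    fg = (np.asarray(sem_prob) > sem_thr) & ~(np.asarray(bnd_prob) > bnd_thr)
    H, W = fg.shape
    labels = np.zeros((H, W), dtype=int)
    n = 0
    for r in range(H):
        for c in range(W):
            if not fg[r, c] or labels[r, c]:
                continue
            n += 1                      # start a new component
            labels[r, c] = n
            queue = deque([(r, c)])
            while queue:                # breadth-first flood fill
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W and fg[yy, xx] and not labels[yy, xx]:
                        labels[yy, xx] = n
                        queue.append((yy, xx))
    return labels, n

# Two touching nuclei separated by a one-pixel-wide predicted boundary.
sem = np.ones((3, 5))
bnd = np.zeros((3, 5))
bnd[:, 2] = 1.0
labels, n = boundary_subtracted_proposals(sem, bnd)
# n is 2: the boundary column splits the foreground into two proposals
```

Because the boundary pixels are removed before labelling, the proposals are slightly smaller than the true instances, which is one reason a later refinement stage is needed.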

A more abstract conversion of boundary information into region supervision appears in "Region-wise Loss for Biomedical Image Segmentation" (Valverde et al., 2021). The Region-wise loss is

$L_{RW}=\sum_{i=1}^{N}\sum_{k=1}^{K}\hat{y}_{ik}\,z_{ik},$

where $\hat{y}_{ik}$ is the predicted probability of class $k$ at pixel $i$ and $z_{ik}$ is a per-pixel, per-class RW map derived from labels alone. Under this framework, Active Contour and Boundary loss can be reformulated by appropriate choices of $z_{ik}$. For boundary-derived supervision, naive signed-distance maps are shown to induce optimization instability because gradient signs depend on differences between map entries at the same pixel and can produce multiple negative components. The paper introduces a rectification principle: for a pixel whose true class is $k$, all non-true channels should share the same value, and the true-class weight should lie below that shared value. Its rectified Region-wise (RRW) map sets outside values to a constant and inside values to normalized negative distances, yielding bounded, sign-correct gradients without auxiliary regularization or schedules. Experiments on ACDC17, BraTS18, and KiTS19 show state-of-the-art or comparable Dice and Hausdorff performance, and the convergence study reports that RRW eliminates the heavy tail of failed runs observed with unrectified boundary maps (Valverde et al., 2021).

"A Region-based Randers Geodesic Approach for Image Segmentation" (Chen et al., 2019) takes yet another route. Instead of using boundaries only to define an eikonal speed, it injects region homogeneity into a Randers metric, which in its generic form reads

$\mathcal{F}(x,u)=\sqrt{\langle u,\mathcal{M}(x)\,u\rangle}+\langle \omega(x),u\rangle,$

where $\mathcal{M}(x)$ is a positive definite tensor field and the drift term $\omega$ is derived from the region term of an active-contour functional. This transforms the minimization of a region-and-boundary energy into the solution of a Randers eikonal PDE. The resulting interactive pipeline estimates closed contours by concatenating piecewise minimal geodesics inside a tube around the current boundary. The paper presents this as a way to prevent shortcutting through interiors when edge cues alone are weak or cluttered. Although the paper does not use the term B2R explicitly, its formulation is a boundary-to-region segmentation engine in the precise sense that regional appearance influences boundary geodesics throughout the tube (Chen et al., 2019).

4. Sequence ordering and boundary-aware scanning

In low-level restoration, "ShadowMamba: State-Space Model with Boundary-Region Selective Scan for Shadow Removal" (Zhu et al., 2024) defines B2R as a mask-driven sequence reordering mechanism for a Mamba state-space model. Given a binary shadow mask $M$, the image is partitioned into non-overlapping windows of size $w\times w$, and each window $W$ is typed by

$\mathrm{type}(W)=\begin{cases}\text{non-shadow}, & M(p)=0 \text{ for all } p\in W,\\ \text{shadow}, & M(p)=1 \text{ for all } p\in W,\\ \text{boundary}, & \text{otherwise.}\end{cases}$

This yields non-shadow, boundary, and shadow window sets, which are concatenated in the order non-shadow $\rightarrow$ boundary $\rightarrow$ shadow. The resulting permutation shortens sequence distances among same-type pixels, especially boundary pixels, before horizontal, vertical, reverse-horizontal, and reverse-vertical scans. ShadowMamba embeds this BRSSM branch inside a U-Net with a global GSSM branch, a CSSM channel branch, and an Efficient Feed-Forward Network.
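The window typing and reordering step can be sketched in NumPy. This is a minimal sketch under stated assumptions: the function names, the integer type codes, and the choice to keep raster order within each type are illustrative, and the sketch produces only the window permutation, not the subsequent directional scans.

```python
import numpy as np

def window_types(mask, w):
    """Type each w-by-w window: 0 = non-shadow, 1 = boundary, 2 = shadow."""
    mask = np.asarray(mask).astype(bool)
    H, W = mask.shape
    if H % w or W % w:
        raise ValueError("mask height and width must be multiples of w")
    # Rearrange to (H//w, W//w, w*w) so each window's pixels share the last axis.
    win = mask.reshape(H // w, w, W // w, w).transpose(0, 2, 1, 3)
    win = win.reshape(H // w, W // w, w * w)
    all_shadow = win.all(axis=-1)
    any_shadow = win.any(axis=-1)
    # Mixed windows (some but not all pixels in shadow) are boundary windows.
    return np.where(all_shadow, 2, np.where(any_shadow, 1, 0))

def boundary_region_order(mask, w):
    """Window indices in the order non-shadow, boundary, shadow.

    A stable sort keeps raster order inside each type (an assumption).
    """
    return np.argsort(window_types(mask, w).ravel(), kind="stable")

mask = np.array([[0, 0, 0, 1],
                 [0, 0, 0, 0],
                 [1, 1, 0, 0],
                 [1, 1, 0, 0]])
types = window_types(mask, 2)            # [[0, 1], [2, 0]]
order = boundary_region_order(mask, 2)   # [0, 3, 1, 2]
```

In the toy mask, the two non-shadow windows (indices 0 and 3) move to the front, followed by the single boundary window and then the shadow window, so same-type windows become adjacent in the sequence.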

The central claim is not merely that masks help, but that the specific boundary-region ordering improves semantic continuity and local coherence along shadow boundaries, where brightness discontinuities are most abrupt. This is reflected in the ablation results on ISTD: starting from GSSM alone, adding BRSSM reduces RMAE, and CSSM and EFFN are then added in turn. A scan-type ablation further compares RMAE for a mask scan without local scan, a local scan without region separation, and the boundary-region scan. The full model reports RMAE, PSNR, and SSIM on SRD and ISTD, and RMAE for ALL/S/NS on ISTD+ (Zhu et al., 2024).

This version of B2R is notable because the “region” is not produced by spatial aggregation after attention or propagation. It is produced first by a permutation, and only then processed by a linear-time SSM. The paper therefore treats B2R as an ordering prior for long-sequence modeling rather than as a message-passing mask (Zhu et al., 2024).

5. Boundary-to-Region in offline safe reinforcement learning

"Boundary-to-Region Supervision for Offline Safe Reinforcement Learning" (Su et al., 30 Sep 2025) shifts B2R from geometry to decision-making. Its starting point is the asymmetry between return-to-go and cost-to-go in a constrained MDP:

$R_t=\sum_{t'=t}^{T} r_{t'}, \qquad C_t=\sum_{t'=t}^{T} c_{t'}, \qquad C_1\le\kappa.$

Here $r_t$ and $c_t$ are the per-step reward and cost, $T$ is the horizon, and $\kappa$ is the cost budget. The paper argues that RTG is a flexible target, whereas CTG is a rigid feasibility boundary. Standard DT-style sequence models condition symmetrically on RTG and CTG tokens, which the paper identifies as a source of brittle token selection and sparse, near-boundary supervision.

B2R resolves this by filtering to safe trajectories,

$\mathcal{D}_{\text{safe}}=\{\tau\in\mathcal{D} : C_1(\tau)\le\kappa\},$

and then realigning cost-to-go so that every safe trajectory is conditioned on the same deployment-time boundary token:

$\tilde{C}_t = C_t + (\kappa - C_1),$

which guarantees $\tilde{C}_1=\kappa$ while preserving the temporal cost-decay profile. Training then uses the usual autoregressive behavior-cloning form, but with realigned CTG tokens and RoPE. Conceptually, boundary-only supervision over trajectories near the budget is replaced by region-wide supervision over the entire safe region under a fixed boundary condition.
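The filter-then-realign step can be sketched on toy cost sequences. This is an illustration under stated assumptions: the function names are hypothetical, and the additive shift is one realignment that makes the first cost-to-go token equal the budget while leaving per-step decrements unchanged; it should not be read as the paper's exact formula.

```python
import numpy as np

def cost_to_go(costs):
    """Cost-to-go at each step: the sum of costs from that step to the end."""
    costs = np.asarray(costs, dtype=float)
    return np.cumsum(costs[::-1])[::-1]

def realign_safe_trajectories(cost_sequences, kappa):
    """Keep trajectories whose total cost is within kappa, then shift their
    cost-to-go so the first token equals kappa (ASSUMED additive shift)."""
    realigned = []
    for costs in cost_sequences:
        ctg = cost_to_go(costs)
        if ctg[0] > kappa:          # unsafe under the budget: filtered out
            continue
        realigned.append(ctg + (kappa - ctg[0]))
    return realigned

trajectories = [[1, 0, 2], [5, 5, 5], [0, 1, 0]]
tokens = realign_safe_trajectories(trajectories, kappa=10)
# tokens[0] is [10., 9., 9.] and tokens[1] is [10., 10., 9.];
# the middle trajectory (total cost 15) is dropped.
```

Both surviving trajectories now start from the same boundary token, so a sequence model conditioned on that token sees supervision from the whole safe set rather than only from trajectories whose total cost happens to sit near the budget.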

The paper also states formal guarantees under Safe-Aligned Data and Prediction-Error Bound assumptions. It further proves reward dominance over boundary-only supervision when the dataset contains an optimal safe trajectory strictly inside the feasible region. Empirically, B2R is evaluated on DSRL tasks and is reported to satisfy safety on most of them while achieving the highest reward on a subset. The paper attributes remaining failures in CarCircle1/2 and AntCircle to dataset limitations, specifically the tight coupling of high reward and high cost under scarce safe data (Su et al., 30 Sep 2025).

This formulation is a strong reminder that B2R need not involve images at all. Here the “boundary” is a rigid budget, and the “region” is the set of all safe trajectories consistent with that budget token.

6. Boundary-defined regions in urban representation, mechanics, and graph theory

In "Boundary Prompting: Elastic Urban Region Representation via Graph-based Spatial Tokenization" (Zhu et al., 11 Mar 2025), B2R is the transformation of a user-specified geometry into an elastic region embedding. A prompted boundary induces a token set, which is extracted by querying an R-tree over spatial tokens and then expanding to adjacent virtual tokens through a spatial-virtual index. The induced subgraph is embedded by type-wise SUM aggregation followed by CONCAT, and then updated by multi-channel message passing over structure, position, and neighbor relations. The running time of the paper’s online extraction algorithm is governed by the number of spatial tokens inside the queried boundary and the average number of virtual tokens linked to each spatial token. On dynamic-region tasks in NYC and CHI, BPURF reports regression metrics including MAE and RMSE, for example on NYC crime and CHI crash, and it reports subgraph-extraction speedups over naive extraction (Zhu et al., 11 Mar 2025).

A different boundary-to-region notion appears in "Regions of possible motion in mechanical systems" (Kharlamov, 2014). There, the region of possible motion is the projection of an integral manifold onto a lower-dimensional configuration space, and the generalized boundary is the visible contour of the projection. The paper states a criterion for this boundary in terms of the restriction of the differential of the integrals to the fiber direction. In the Euler-Zhukovsky gyrostat example, the generalized boundary partitions the Poisson sphere into components with constant admissible-velocity fiber type, and this information reconstructs the phase topology. Away from the bifurcation set, each integral manifold is either empty or of one of two regular types, while singular parameter values yield circles and products involving a figure-eight curve (Kharlamov, 2014).

In extremal graph theory, "On the boundary of the region defined by homomorphism densities" (Hatami et al., 2016) studies the feasible region of tuples of graph homomorphism densities. For the classical edge-triangle case, the boundary is a countable union of algebraic curves and is almost everywhere differentiable. The paper then constructs finitely forcible lexicographic families for which a boundary restriction along certain hyperplanes becomes nowhere differentiable. Here B2R is not an algorithmic framework but an analytic principle: the geometry of a feasible region is governed by the complexity of the extremal boundary configurations that realize it (Hatami et al., 2016).

These three papers show that the boundary-to-region idea can denote prompt-conditioned region construction, projection-defined feasibility, or the analytic structure of a density region, depending on the mathematical setting.

7. Common principles, misconceptions, and open issues

Several common principles recur across the literature. First, B2R typically restricts the source of useful information. BCANet replaces full-map self-attention with cross-stream attention keyed by boundary features, and BFP converts semantic boundaries into multiplicative message gates rather than allowing unconstrained propagation (Ma et al., 2021, Ding et al., 2019). Second, B2R often densifies supervision. In safe RL, sparse supervision near the safety boundary is replaced by all safe trajectories realigned to a fixed boundary token; in Region-wise loss, sparse boundary distances become full region-weight maps; in BRP-Net, thin separating contours become instance-level connected components (Su et al., 30 Sep 2025, Valverde et al., 2021, Chen et al., 2020). Third, B2R is compatible with both explicit and implicit restrictions: BCANet states that the implemented BCA achieves the boundary-to-region effect implicitly without a hard mask in the forward pass, whereas BRP-Net uses explicit thresholding and subtraction, and ShadowMamba uses an explicit region-wise permutation (Ma et al., 2021, Chen et al., 2020, Zhu et al., 2024).

Several misconceptions are addressed directly by the papers. B2R is not synonymous with “boundary refinement”: GSCNN-like sharpening is contrasted with BCANet’s use of boundary features as keys for interior context aggregation (Ma et al., 2021). Nor does B2R necessarily require closed contours: BCANet is presented as more robust than BFP when boundaries are not closed because its attention concentrates on boundary positions without requiring a closed contour, whereas BFP can over-smooth and leak across classes when boundaries are missed (Ma et al., 2021, Ding et al., 2019). It is also not restricted to papers that explicitly use the term. BRP-Net’s proposal generation and the Randers geodesic model are described in the source material as conceptual B2R realizations even though that terminology is not their original nomenclature (Chen et al., 2020, Chen et al., 2019).

The main open issues are similarly recurrent. Performance is often sensitive to boundary quality: BCANet notes noisy boundaries, thin structures, and class-agnostic leakage; BFP notes missed or spurious boundaries and sensitivity to boundary-band width; ShadowMamba depends on binary masks and may blur very thin boundary structures at window level; BRP-Net remains challenged by blurry or severely overlapping nuclei; and BPURF depends on rich token coverage and tuned spatial augmentation and top-$k$ settings (Ma et al., 2021, Ding et al., 2019, Zhu et al., 2024, Chen et al., 2020, Zhu et al., 11 Mar 2025). In safe RL, B2R improves over boundary-only supervision but still degrades when feasible trajectories under the deployment budget are extremely scarce or when severe out-of-distribution deployment breaks the prediction-error assumptions (Su et al., 30 Sep 2025). In the graph-theoretic setting, the edge-triangle case is well behaved, but the paper explicitly leaves open whether almost-everywhere differentiability persists for broader graph families (Hatami et al., 2016).

A plausible implication is that B2R is best understood as a modeling discipline: define a boundary signal that captures feasibility or separation, then use it to constrain region-level computation, supervision, or representation more tightly than generic global processing would. The surveyed papers differ sharply in modality and formalism, but they converge on that operational pattern.