DRBD-Mamba: Dual-Resolution 3D Segmentation
- DRBD-Mamba is a 3D deep learning architecture that integrates dual-resolution processing with bi-directional state space modeling to enhance medical image segmentation.
- It employs innovative techniques such as Morton indexing, adaptive gated fusion, and vector quantization to preserve spatial topology and boost computational efficiency.
- Evaluations on benchmarks like BraTS2023 demonstrate improved Dice accuracy and up to 15× reduction in computational cost compared to existing models.
Dual-Resolution Bi-Directional Mamba (DRBD-Mamba) refers to a class of 3D deep learning architectures that unify multi-scale spatial representation and efficient long-range dependency modeling within State Space Model (SSM) frameworks, particularly for medical image segmentation. DRBD-Mamba variants are characterized by dual-resolution encoder–decoder structures, adaptive bi-directional state transitions, space-filling curve indices for spatial locality preservation, and context-aware feature fusion mechanisms. These models seek to combine the superior global context modeling capacity of Mamba SSMs with computationally efficient, anatomically coherent handling of spatial topology and multi-scale features.
1. Architectural Foundations
DRBD-Mamba architectures implement dual-resolution processing: high-resolution stages preserve fine local detail, while low-resolution stages enable efficient extraction of long-range global dependencies. The encoder typically consists of stacked 3D convolutional layers, with Mamba modules strategically inserted at bottleneck locations and in skip connections. Bi-directional Mamba blocks are applied at two critical scales: after aggressive spatial downsampling (for global context) and at intermediate/high resolutions via skip path transformation, allowing both local and global feature propagation without excessive computational expense (Ali et al., 16 Oct 2025).
To flatten volumetric grid data for SSM processing, the model employs a space-filling curve—specifically the Morton (Z-order) indexing. This mapping transforms 3D tensor coordinates into 1D sequences while maintaining spatial neighborhood properties:
$$M(x, y, z) = \sum_{i=0}^{b-1} \big[ (x_i \ll 3i) \,\vert\, (y_i \ll (3i+1)) \,\vert\, (z_i \ll (3i+2)) \big]$$
where $x_i, y_i, z_i$ denote the binary bits of the spatial position $(x, y, z)$, $\ll$ is the bit-shift operation, and $\vert$ is bitwise OR (Ali et al., 16 Oct 2025). This preserves local anatomical coherency during 1D Mamba sequence processing and avoids the padding overhead required by dyadic Hilbert curves.
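As a concrete illustration, Morton indexing takes only a few lines. The sketch below is a minimal, self-contained version (the function name `morton_index` and the fixed bit width are illustrative, not from the paper): it interleaves the bits of each coordinate so that spatially adjacent voxels tend to receive nearby 1D indices.

```python
def morton_index(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the binary bits of (x, y, z) into a single Z-order index.

    Bit i of each coordinate lands at position 3*i (x), 3*i + 1 (y),
    and 3*i + 2 (z), so nearby voxels map to nearby sequence positions.
    """
    idx = 0
    for i in range(bits):
        idx |= ((x >> i) & 1) << (3 * i)
        idx |= ((y >> i) & 1) << (3 * i + 1)
        idx |= ((z >> i) & 1) << (3 * i + 2)
    return idx

# Flatten an 8x8x8 volume into Morton (Z-order) traversal order:
order = sorted(
    ((xi, yi, zi) for xi in range(8) for yi in range(8) for zi in range(8)),
    key=lambda p: morton_index(*p),
)
```

Note that sorting coordinates by this key works for any grid size, which is consistent with the text's point that Morton ordering avoids the padding that dyadic Hilbert curves require.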
2. Bi-Directional Mamba Blocks and Context Fusion
In contrast to standard unidirectional Mamba SSMs, DRBD-Mamba utilizes bi-directional blocks, processing each 1D sequence in both forward and reverse directions via input-dependent state recurrences:
- Forward: $h_t^{f} = \bar{A}_t\, h_{t-1}^{f} + \bar{B}_t\, x_t$
- Reverse: $h_t^{r} = \bar{A}_t\, h_{t+1}^{r} + \bar{B}_t\, x_t$
with output projection $y_t = C_t h_t$, where $\bar{B}_t$ and $C_t$ are input-dependent (Ali et al., 16 Oct 2025).
A learnable gated fusion mechanism adaptively integrates outputs from both directions on a per-channel basis. Gating weights $g$ are computed via a sigmoid function with learnable parameters, yielding the fused representation:
$$y = g \odot y^{f} + (1 - g) \odot y^{r}$$
Here, $\odot$ denotes element-wise multiplication, enabling dynamic weighting between contextual streams for each channel.
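A minimal NumPy sketch may make the data flow concrete. The diagonal state parameterization and the exact gate form (a sigmoid of a learnable per-channel parameter `theta`) are simplifying assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def ssm_scan(x, A_bar, B_bar, C, reverse=False):
    """Selective-scan recurrence over a 1D sequence.

    x, A_bar, B_bar, C: (T, d) arrays; A_bar, B_bar, C play the roles of
    the input-dependent discretized SSM parameters (diagonal state assumed).
    Forward:  h_t = A_bar_t * h_{t-1} + B_bar_t * x_t
    Reverse:  h_t = A_bar_t * h_{t+1} + B_bar_t * x_t
    Output:   y_t = C_t * h_t
    """
    T, d = x.shape
    steps = range(T - 1, -1, -1) if reverse else range(T)
    h = np.zeros(d)
    y = np.zeros_like(x)
    for t in steps:
        h = A_bar[t] * h + B_bar[t] * x[t]
        y[t] = C[t] * h
    return y

def gated_fusion(y_f, y_r, theta):
    """Per-channel gate g = sigmoid(theta) blending the two streams:
    fused = g * y_f + (1 - g) * y_r (element-wise products).
    The gate parameterization here is an assumption for illustration."""
    g = 1.0 / (1.0 + np.exp(-theta))
    return g * y_f + (1.0 - g) * y_r
```

Running both scans over the same Morton-ordered sequence and fusing them gives each position context from both ends of the volume at linear cost in sequence length.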
3. Multi-Scale Feature Handling and Efficient Decoding
Dual-resolution encoding enables both fine and coarse spatial topologies to be represented. Context-rich features from bottleneck-level bi-directional Mamba modules are combined with anatomically detailed features from skip connections, themselves transformed by Mamba blocks at intermediate resolution (Ali et al., 16 Oct 2025).
Decoding involves multi-scale fusion: features from all encoder stages are linearly projected, resampled, and passed through further Mamba-driven fusion modules, followed by 3D upsampling to restore full resolution. This facilitates robust spatial detail recovery in segmentation outputs while maintaining global consistency.
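The multi-scale fusion step above can be sketched as: project each encoder stage to a common channel width (a 1×1×1 convolution reduces exactly to a channel-wise matrix product), resample to the finest grid, and sum. Nearest-neighbour upsampling, the function names, and cubic volumes with integer scale factors are illustrative assumptions, not the paper's exact decoder.

```python
import numpy as np

def upsample_nn(feat, factor):
    """Nearest-neighbour 3D upsampling of a (C, D, H, W) feature map."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2).repeat(factor, axis=3)

def fuse_stages(stages, projections):
    """Fuse encoder stages into one map at the finest resolution.

    stages: list of (C_i, D_i, H_i, W_i) feature maps, finest first.
    projections: matching list of (C_out, C_i) matrices; a 1x1x1 conv
    is exactly this channel-wise matrix product.
    """
    target = stages[0].shape[1]  # finest spatial size
    fused = 0
    for feat, proj in zip(stages, projections):
        mixed = np.einsum('oc,cdhw->odhw', proj, feat)  # channel projection
        fused = fused + upsample_nn(mixed, target // feat.shape[1])
    return fused
```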
4. Feature Robustness via Quantization
After gated fusion, feature representations are discretized through a vector quantization (VQ) module. Using a codebook $\mathcal{E} = \{e_k\}_{k=1}^{K}$ and an encoded feature $z$, quantization assigns:
$$z_q = e_{k^*}, \qquad k^* = \arg\min_{k} \lVert z - e_k \rVert_2$$
This enforces features to reside near prototype embeddings, increasing noise robustness and reducing overfitting. The quantization is applied to output sequences post-Mamba processing, ensuring that context modeling does not drift during training (Ali et al., 16 Oct 2025).
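The nearest-codebook assignment is straightforward to implement; the sketch below is a generic VQ lookup (function name and shapes are illustrative), not the paper's exact module.

```python
import numpy as np

def vector_quantize(z, codebook):
    """Snap each feature vector to its nearest codebook prototype.

    z: (N, d) encoded features; codebook: (K, d) embeddings e_k.
    Returns (z_q, idx) with z_q[n] = e_k for k = argmin_k ||z_n - e_k||_2.
    """
    # Pairwise squared L2 distances, shape (N, K).
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return codebook[idx], idx
```

In training, a straight-through estimator is the usual way to pass gradients through the non-differentiable argmin, as in standard VQ-VAE practice.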
5. Computational Efficiency and Throughput
DRBD-Mamba achieves significant efficiency gains by confining expensive long-range dependency modeling to minimal, well-chosen sites within the network (bottleneck and a deep skip connection), and by leveraging space-filling curve mapping to minimize padded and fragmented computations. For example, the use of Morton ordering rather than Hilbert curves avoids tensor padding overhead for arbitrary grid sizes and streamlines mapping operations.
Reported efficiency metrics include up to a 15× reduction in computational cost compared to full multi-axial scan Mamba models, while maintaining high segmentation accuracy. This efficiency enables DRBD-Mamba to scale to large 3D volumes and real-time clinical deployments (Ali et al., 16 Oct 2025).
6. Comparative Performance and Evaluation Protocols
On the BraTS2023 brain tumor segmentation benchmark, DRBD-Mamba attains Dice improvements for whole tumor, tumor core, and enhancing tumor when evaluated on the test split commonly used in recent literature. Systematic five-fold evaluations, partitioned by average tumor intensity, demonstrate that DRBD-Mamba maintains competitive whole-tumor accuracy with clear Dice gains for tumor core and enhancing tumor over the existing state of the art (Ali et al., 16 Oct 2025).
Analyses of failure modes indicate that very small tumor volumes pose substantial challenges, particularly for enhancing tumor regions. The DRBD-Mamba context-aware design and quantization mitigate but do not entirely eliminate the decline in these scenarios.
Comparative studies with 3D CNNs (e.g., U-Net), transformer-based architectures (UNETR, SwinUNETR), and earlier Mamba-based models demonstrate both superior Dice scores (especially for challenging sub-regions) and reduced computational requirements.
7. Relationship to Related Mamba Variants
LBMamba (Zhang et al., 19 Jun 2025) introduces an efficient locally bi-directional SSM block, which could plausibly support dual-resolution architectures by alternating or parallelizing scan directions at different spatial scales. The core innovation—embedding a backward scan at the thread-register level—delivers bidirectional context without global backward sweeps or memory overhead, and could be adapted in DRBD-Mamba frameworks to further improve performance-throughput trade-offs.
DM-SegNet’s quadri-directional spatial Mamba (Ji, 5 Jun 2025) exploits multi-directional global context extraction and gated spatial convolution to preserve anatomical features, and shares architectural principles—such as multi-scale feature fusion and bidirectional decoding—with DRBD-Mamba.
Conclusion
Dual-Resolution Bi-Directional Mamba unifies efficient multi-scale modeling and robust global context integration in 3D medical image analysis through dual-resolution encoder–decoder frameworks, adaptive bi-directional SSM blocks, spatial locality-preserving sequence mapping, gated fusion, and quantization. These technical innovations yield models that outperform contemporary baselines in terms of Dice accuracy, computational efficiency, and robustness to data heterogeneity—substantiated by systematic clinical benchmark evaluations (Ali et al., 16 Oct 2025, Ji, 5 Jun 2025, Zhang et al., 19 Jun 2025).