Dual-Branch Residual Network (DB-ResNet)
- Dual-Branch Residual Network (DB-ResNet) is a deep neural network architecture featuring two parallel branches that capture complementary features to boost representation power.
- It employs design strategies like spatial-spectral fusion, resolution-parallel processing, and dual residual blocks to enhance feature extraction across various modalities.
- DB-ResNet has demonstrated improvements in metrics such as PSNR and Dice scores in applications including HDR imaging, hyperspectral classification, and CT segmentation.
A Dual-Branch Residual Network (DB-ResNet) is a class of deep neural network architectures that extends the conventional residual network paradigm by integrating two parallel computational pathways (branches or streams), each emphasizing different types or granularities of feature extraction. Originating as a generalization and diversification of the residual connection concept, DB-ResNet designs have found broad application across computer vision tasks including, but not limited to, image restoration, high dynamic range (HDR) imaging, hyperspectral image classification, and medical image segmentation. These architectures are typified by simultaneous learning along distinct branches (such as high/low resolution, spatial/spectral, or multi-view/multi-scale features), with inter- or post-branch fusion to enhance representational power and robustness while maintaining or reducing computational cost.
1. Core Architectural Principles of DB-ResNet
The defining principle of DB-ResNet architectures is the parallel execution of two branches, each comprising a stack of convolutional (and often residual) blocks, with each branch directed towards capturing complementary aspects of the input. The fusion of these branches—either at the feature or decision level—enables the network to integrate fine-grained and contextual information.
Several canonical instantiations include:
- RiR (ResNet-in-ResNet): Two streams, a residual stream with identity shortcut and a transient stream without, are coupled in each block. Forward pass equations are:

$$r_{l+1} = \sigma\big(\mathrm{conv}(r_l, W_{l,r\to r}) + \mathrm{conv}(t_l, W_{l,t\to r}) + r_l\big), \qquad t_{l+1} = \sigma\big(\mathrm{conv}(r_l, W_{l,r\to t}) + \mathrm{conv}(t_l, W_{l,t\to t})\big),$$

where each $W$ is a learnable convolution kernel and $\sigma$ denotes batch normalization followed by ReLU. Only the residual branch carries explicit identity shortcuts (Targ et al., 2016); see the sketch after this list.
- Spatial-Spectral or Multi-Scale Design: One branch focuses on extracting spatial or multi-view features (e.g., larger receptive fields, different patch sizes), while the second branch extracts spectral, scale, or intensity-based features. Examples include the spatial/spectral fusion in hyperspectral classification (Qin et al., 27 Apr 2025) and multi-view/multi-scale pathways in CT imaging (Cao et al., 2019).
- Resolution-Parallel Fusion: A full-resolution branch preserves high-frequency detail through operations such as deformable convolutions, while a low-resolution branch attends to global context using spatial or channel attention modules before fusion (Marín-Vega et al., 2022, Wu et al., 2023).
- Dual Residual Block (DuRN): Each block contains two distinct paired operations, $F^l$ and $G^l$, each with its own residual path, yielding

$$y^{l} = x^{l} + F^{l}(x^{l}), \qquad x^{l+1} = y^{l} + G^{l}(y^{l}),$$

which enables rich long-range interactions across network depth (Liu et al., 2019); see the sketch after this list.
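The two block types above can be made concrete in a few lines of PyTorch. The following is a minimal illustrative sketch, not the reference implementation of either paper; channel counts, kernel sizes, and the specific choice of paired operations $F$/$G$ are assumptions.

```python
import torch
import torch.nn as nn


class GeneralizedResidualBlock(nn.Module):
    """RiR-style block: a residual stream r (with identity shortcut) and a
    transient stream t (no shortcut), coupled through cross-stream convolutions."""

    def __init__(self, channels: int):
        super().__init__()
        self.w_rr = nn.Conv2d(channels, channels, 3, padding=1)  # r -> r
        self.w_tr = nn.Conv2d(channels, channels, 3, padding=1)  # t -> r
        self.w_rt = nn.Conv2d(channels, channels, 3, padding=1)  # r -> t
        self.w_tt = nn.Conv2d(channels, channels, 3, padding=1)  # t -> t
        self.sigma_r = nn.Sequential(nn.BatchNorm2d(channels), nn.ReLU())
        self.sigma_t = nn.Sequential(nn.BatchNorm2d(channels), nn.ReLU())

    def forward(self, r, t):
        r_next = self.sigma_r(self.w_rr(r) + self.w_tr(t) + r)  # identity shortcut
        t_next = self.sigma_t(self.w_rt(r) + self.w_tt(t))      # no shortcut
        return r_next, t_next


class DualResidualBlock(nn.Module):
    """DuRN-style block: paired operations F and G, each wrapped in its own
    residual connection, matching the equations above."""

    def __init__(self, f_op: nn.Module, g_op: nn.Module):
        super().__init__()
        self.f_op, self.g_op = f_op, g_op

    def forward(self, x):
        y = x + self.f_op(x)     # first residual path, around F
        return y + self.g_op(y)  # second residual path, around G


# Example usage with assumed 64-channel feature maps:
rir = GeneralizedResidualBlock(64)
r, t = rir(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
durn = DualResidualBlock(nn.Conv2d(64, 64, 5, padding=2),
                         nn.Conv2d(64, 64, 3, padding=1))
x = durn(torch.randn(1, 64, 32, 32))
```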
2. Detailed Layer Design and Mathematical Formulation
Within DB-ResNet architectures, both branches follow carefully designed convolutional and residual block sequences, often tailored per task.
- Encoding and Feature Extraction: Each branch ingests domain-specific encoded variants of the input (e.g., LDR + gamma-corrected HDR for bracketed images (Marín-Vega et al., 2022), or differently shaped spatial patches for CT (Cao et al., 2019), or initial 1×1 mapping convolutions for spectral data (Qin et al., 27 Apr 2025)).
- Branch-Specific Blocks:
- Deformable Convolutions: Used for spatial alignment in full-resolution HDR imaging branches (Marín-Vega et al., 2022).
- Dilated Residual Dense Blocks (DRDB): Densely connected stacks with dilation to increase receptive field (Marín-Vega et al., 2022).
- Attention Modules: Spatial attention for low-resolution suppression of ghosting, channel attention (SE) for discriminative feature enhancement, as in DRANet (Wu et al., 2023) and DuRN-S (Liu et al., 2019).
- Hybrid Blocks: In denoising, hybrid dilated residual attention blocks (HDRAB) facilitate broad contextual modeling via dilation rates and channel attention (Wu et al., 2023).
- Fusion Mechanisms: Fusion methods include concatenation followed by convolutional reduction (Marín-Vega et al., 2022, Wu et al., 2023, Qin et al., 27 Apr 2025), averaging (Cao et al., 2019), or more complex attention-based fusions; a minimal sketch of channel attention and concatenation fusion follows.
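To ground two of the components above, the sketch below pairs a squeeze-and-excitation style channel-attention module with concatenation fusion plus 1×1 reduction. It is a hedged illustration: branch definitions, channel widths, and the upsampling choice are assumptions rather than details from any one cited architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SEChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention for feature enhancement."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(-2, -1)))  # squeeze: global average pooling
        return x * w[:, :, None, None]     # excite: per-channel rescaling


class ConcatFusion(nn.Module):
    """Fuse two branch feature maps by channel concatenation followed by
    a 1x1 convolutional reduction."""

    def __init__(self, ch_a: int, ch_b: int, ch_out: int):
        super().__init__()
        self.reduce = nn.Conv2d(ch_a + ch_b, ch_out, kernel_size=1)

    def forward(self, feat_a, feat_b):
        # Bring the low-resolution branch up to the full-resolution grid.
        if feat_b.shape[-2:] != feat_a.shape[-2:]:
            feat_b = F.interpolate(feat_b, size=feat_a.shape[-2:],
                                   mode="bilinear", align_corners=False)
        return self.reduce(torch.cat([feat_a, feat_b], dim=1))


# Example: attend over 128-channel low-res features, then fuse with the
# 64-channel full-resolution branch.
attn, fusion = SEChannelAttention(128), ConcatFusion(64, 128, 64)
low = attn(torch.randn(1, 128, 32, 32))
out = fusion(torch.randn(1, 64, 64, 64), low)
assert out.shape == (1, 64, 64, 64)
```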
3. Loss Functions and Optimization
DB-ResNet models employ loss functions tailored to the structural targets and domain characteristics:
- Standard Regression/Classification Losses:
- $\ell_1$ or MSE losses for HDR prediction and denoising (Marín-Vega et al., 2022, Wu et al., 2023).
- Voxel-wise cross-entropy for segmentation (Cao et al., 2019).
- Prototypical classification and contrastive refinement for few-shot learning (Qin et al., 27 Apr 2025).
- Specialized Norms and Transformations:
- $\mu$-law tonemapping after normalization for HDR rendering (Marín-Vega et al., 2022):

$$T(H) = \frac{\log(1 + \mu H)}{\log(1 + \mu)},$$

where $H$ is the normalized HDR image and $\mu$ is a compression constant (commonly $\mu = 5000$); implemented in the sketch after this list.
- Domain Adaptation Terms:
- MMD loss for cross-domain alignment in hyperspectral learning (Qin et al., 27 Apr 2025).
- Contrastive and Cluster-aware Penalties:
- Query-prototype contrastive loss to refine prototype location and improve intra-class compactness/inter-class separability (Qin et al., 27 Apr 2025).
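As a minimal sketch of two of these terms, the code below implements the $\mu$-law tonemapped $\ell_1$ reconstruction loss and a linear-kernel stand-in for the MMD alignment penalty. The constant $\mu = 5000$, the $\ell_1$ choice, and the linear kernel are assumptions for illustration, not the exact losses of the cited papers.

```python
import math

import torch

MU = 5000.0  # assumed compression constant, a common choice in deep HDR work


def mu_law_tonemap(h: torch.Tensor, mu: float = MU) -> torch.Tensor:
    """T(H) = log(1 + mu * H) / log(1 + mu), for HDR values normalized to [0, 1]."""
    return torch.log1p(mu * h) / math.log1p(mu)


def tonemapped_l1_loss(pred_hdr: torch.Tensor, gt_hdr: torch.Tensor) -> torch.Tensor:
    # Compare images in the tonemapped domain, where pixel differences track
    # perceived error more closely than in linear HDR space.
    return torch.mean(torch.abs(mu_law_tonemap(pred_hdr) - mu_law_tonemap(gt_hdr)))


def linear_mmd(source_feats: torch.Tensor, target_feats: torch.Tensor) -> torch.Tensor:
    # Linear-kernel MMD: squared distance between the mean embeddings of the
    # source and target domains (inputs of shape [batch, feature_dim]).
    return torch.sum((source_feats.mean(dim=0) - target_feats.mean(dim=0)) ** 2)
```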
4. Application Domains and Task-Specific Instantiations
DB-ResNet variants have been instantiated for a wide spectrum of tasks:
| Domain | Key Architectural Distinction | Representative Paper |
|---|---|---|
| Multi-bracket HDR imaging | Full/low-res dual-branch; deform conv, SA | DRHDR (Marín-Vega et al., 2022) |
| Image denoising (real & synthetic) | RAB/HDRAB dual-branch, spatial/channel attn | DRANet (Wu et al., 2023) |
| Few-shot hyperspectral classification | Spatial/spectral dual-branch, QPL, MMD | (Qin et al., 27 Apr 2025) |
| CT lung nodule segmentation | Multi-view & multi-scale branches, CIP | (Cao et al., 2019) |
| General image restoration/translation | Paired F/G ops in dual residual block | DuRN (Liu et al., 2019) |
| Generalized ResNet backbone | Transient/residual streams, cross-conv | RiR (Targ et al., 2016) |
For each task, DB-ResNet outperforms or matches prior baselines, typically at competitive or reduced parameter count and complexity. Examples include a Dice score of 82.74% for nodule segmentation, 0.5% above radiologist consensus (Cao et al., 2019); a PSNR gain of up to 0.42 dB for HDR fusion while reducing GMACs by 30% (Marín-Vega et al., 2022); and an 8.6% overall accuracy (OA) improvement in few-shot hyperspectral settings (Qin et al., 27 Apr 2025).
5. Ablation Studies and Design Insights
Empirical ablations across the cited works consistently confirm the merit of the dual-branch design:
- Branching Efficacy: Ablating the second branch (whether scale, view, or domain) consistently degrades performance, by roughly 0.3–8% in OA or Dice score depending on the task (Qin et al., 27 Apr 2025, Cao et al., 2019).
- Attention and Fusion Location: Proper placement of spatial/channel attention and of branch fusion points optimizes accuracy and efficiency. For instance, applying CIP early in the network improves segmentation, and a single fusion location outperforms fusion distributed across multiple points (Cao et al., 2019).
- Hybrid Kernelization: Using asymmetric or dilated kernels in different branches enlarges the receptive field and adds directional sensitivity at minimal FLOP cost (Wu et al., 2023, Qin et al., 27 Apr 2025); the sketch after this list makes the cost comparison concrete.
- Parameter and Computational Efficiency: Dual-path schemes often lead to marked reductions in effective GMACs and runtime (e.g., 25–45% reduction vs. monolithic baselines in HDR, 30–40% in denoising), while jointly improving reconstruction or classification metrics (Marín-Vega et al., 2022, Wu et al., 2023).
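The receptive-field-versus-cost tradeoff can be checked directly. This short sketch (channel width is an arbitrary assumption) compares parameter counts for a standard 3×3 convolution, a dilated 3×3 convolution, and an asymmetric 1×3 + 3×1 pair:

```python
import torch.nn as nn

C = 64  # assumed channel width


def n_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())


std = nn.Conv2d(C, C, kernel_size=3, padding=1)                  # 3x3 receptive field
dilated = nn.Conv2d(C, C, kernel_size=3, padding=2, dilation=2)  # 5x5 receptive field
asym = nn.Sequential(nn.Conv2d(C, C, (1, 3), padding=(0, 1)),    # horizontal, then
                     nn.Conv2d(C, C, (3, 1), padding=(1, 0)))    # vertical: 3x3 field

print(n_params(std), n_params(dilated), n_params(asym))
# -> 36928 36928 24704: dilation widens the receptive field at zero extra
#    parameter cost, and the asymmetric pair covers a 3x3 field ~33% cheaper.
```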
A plausible implication is that DB-ResNet architectures achieve a favorable tradeoff between expressivity and parameter/compute cost, owing to their modularity and complementary feature extraction.
6. Generalization and Limitations
DB-ResNet architectures generalize a broad class of existing models:
- If cross-branch couplings are disabled or their weights are zeroed, DB-ResNet reduces to a plain CNN; if only identity shortcuts are active, it reduces to a standard ResNet (Targ et al., 2016). This reduction is made explicit after this list.
- The dual-residual block permits combinatorial growth in the number of implicit sub-networks (paths through paired operations), surmounting expressivity limitations of single-residual cascades (Liu et al., 2019).
- Over-parameterization, excessive block depth, or poorly placed branch fusion can lead to overfitting or reduced generalization, as shown empirically in lung nodule segmentation where increasing backbone depth degraded performance (Cao et al., 2019).
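To make the first reduction concrete (notation as in Section 1): zeroing the cross-stream weights decouples the two streams,

$$W_{l,t\to r} = W_{l,r\to t} = 0 \;\Rightarrow\; r_{l+1} = \sigma\big(\mathrm{conv}(r_l, W_{l,r\to r}) + r_l\big), \quad t_{l+1} = \sigma\big(\mathrm{conv}(t_l, W_{l,t\to t})\big),$$

so the residual stream follows the standard single-branch ResNet recursion while the transient stream degenerates into a plain feed-forward CNN; dropping the identity term as well leaves a plain CNN in both streams.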
7. Quantitative Performance and Benchmark Summary
DB-ResNet models establish state-of-the-art or competitive results in their respective domains, as evidenced by the following performance metrics:
| Task / Metric | Baseline | DB-ResNet Variant | Improvement | Reference |
|---|---|---|---|---|
| HDR Fusion/PSNR-μ (internal) | 35.14 dB | 35.22 dB | +0.08 dB | (Marín-Vega et al., 2022) |
| SIDD Denoising/PSNR | >39.4 dB (top 2) | 39.50 dB | - | (Wu et al., 2023) |
| HSI FS-Class. (Indian Pines) | Baseline FE1 OA | DB-ResNet OA +8.6% | +8.6% | (Qin et al., 27 Apr 2025) |
| Lung Nodule Dice Score | 78.55% (CF-CNN) | 82.74% | +4.19% | (Cao et al., 2019) |
These findings indicate consistent advantage for DB-ResNet models over single-path or simple residual architectures across highly divergent data regimes and modalities.
DB-ResNet architectures represent a principled diversification of the standard residual paradigm, enabling broad, efficient, and robust representational capacity tailored to the fundamental structure of a variety of image and signal processing tasks (Marín-Vega et al., 2022, Targ et al., 2016, Qin et al., 27 Apr 2025, Cao et al., 2019, Wu et al., 2023, Liu et al., 2019).