
Residual-in-Residual Dense Block (RRDB)

Updated 15 February 2026
  • Residual-in-Residual Dense Block (RRDB) is a hierarchical structure that integrates densely connected convolutional layers within multi-level residual connections.
  • It enhances feature reuse and gradient stability, enabling efficient training of very deep networks for challenging image restoration and surrogate modeling tasks in 2D and 3D settings.
  • Empirical studies show RRDB improves accuracy, convergence speed, and generalization compared to traditional residual and dense block designs.

A Residual-in-Residual Dense Block (RRDB) is a hierarchical, multi-level residual architecture that integrates densely connected convolutional layers within residual blocks and introduces an additional outer residual connection. RRDBs are designed to exploit both feature reuse and gradient stability, enabling highly expressive, very deep neural networks for challenging image restoration and surrogate modeling tasks in both 2D and 3D domains. Unlike standard residual or dense blocks, RRDBs employ stacked dense modules, each wrapped in residual connections with explicit residual scaling, and then compose several such modules within a further residual structure. The architecture has been validated in artifact removal, surrogate modeling for non-Gaussian fields, and volumetric medical image super-resolution, demonstrating improved accuracy, training stability, and generalization relative to conventional alternatives (Zini et al., 2019, Mo et al., 2019, Ha et al., 2024).

1. Architectural Principle and Formulation

An RRDB interleaves two or more levels of residual learning with dense connectivity. Each block encapsulates a cascade of Residual Dense Blocks (RDBs), inside which each convolutional unit receives as input the concatenation of all preceding activations within the block, promoting extensive feature reuse.

2D RRDB

For input $x_0 \in \mathbb{R}^{H \times W \times C}$, the canonical architecture is as follows (Zini et al., 2019, Mo et al., 2019):

  • Each RDB comprises $L$ ($=5$) densely connected convolutional layers, yielding activations $h_{i,k}$ for RDB $i$, layer $k$:

h_{i,k} = \sigma \left( W_{i,k} \cdot [\, x_{i-1}, h_{i,1}, \ldots, h_{i,k-1} \,] \right)

where $\sigma$ is LeakyReLU (or Mish/ReLU), $W_{i,k}$ are $3 \times 3$ convolution kernels, and $[\,\cdot\,]$ denotes channelwise concatenation. No batch normalization is used in (Zini et al., 2019).

  • The layer outputs are concatenated into $D_i$:

D_i = \mathrm{concat}(x_{i-1}, h_{i,1}, \ldots, h_{i,L}) \in \mathbb{R}^{H \times W \times (C \cdot (L+1))}

  • The concatenated features are collapsed back to $C$ channels, then added to the block input with residual scaling ($\beta = 0.2$):

b_i = W_{i,\mathrm{agg}} * D_i, \qquad x_i = x_{i-1} + \beta\, b_i

  • After $P$ consecutive RDBs, an outer residual is applied:

F_{\mathrm{RRDB}} = W_{\mathrm{out}} * x_P, \qquad x_{\mathrm{out}} = x_0 + \alpha\, F_{\mathrm{RRDB}}

where $\alpha = 0.2$.
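The per-block computation above can be traced numerically. The snippet below is a minimal NumPy sketch, not the published implementation: real RRDBs use learned $3 \times 3$ convolutions, whereas here each conv is approximated by a random $1 \times 1$ (channel-mixing) projection so that the dense connectivity, local fusion, and two levels of scaled residuals are easy to follow.

```python
import numpy as np

rng = np.random.default_rng(0)
C, g, L, P = 64, 32, 5, 3        # channels, growth rate, dense layers, RDBs per RRDB
alpha = beta = 0.2               # residual scaling constants
H = W = 8                        # toy spatial size

def lrelu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def conv1x1(x, c_out):
    # Stand-in for a 3x3 conv: a random 1x1 (channel-mixing) projection.
    w = rng.normal(0, 0.1, size=(c_out, x.shape[0]))
    return np.einsum('oc,chw->ohw', w, x)

def rdb(x):
    feats = [x]                              # f_0 = block input
    for _ in range(L):                       # each layer sees all prior outputs
        inp = np.concatenate(feats, axis=0)  # channelwise concatenation
        feats.append(lrelu(conv1x1(inp, g)))
    fused = conv1x1(np.concatenate(feats, axis=0), C)  # fuse back to C channels
    return x + beta * fused                  # local residual with scaling

def rrdb(x0):
    x = x0
    for _ in range(P):                       # stack P RDBs
        x = rdb(x)
    return x0 + alpha * conv1x1(x, C)        # outer residual with scaling

x0 = rng.normal(size=(C, H, W))
y = rrdb(x0)
print(y.shape)  # (64, 8, 8): spatial size and channel count are preserved
```

Because every residual branch is shrunk by $0.2$ before addition, the block's output stays close to its input at initialization, which is precisely what makes long chains of these blocks trainable.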

3D RRDB

In volumetric applications, all convolutions and concatenations are replaced by their 3D analogues (kernels $3 \times 3 \times 3$), with dense-block growth rate $g$ (typically $32$) and channel count $C = 64$ (Ha et al., 2024).
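The roughly threefold growth in per-conv parameters when moving from 2D to 3D kernels can be checked directly (bias terms omitted; $C$ and $g$ taken from the typical settings above):

```python
C, g = 64, 32                     # input channels, growth rate (typical values)
params_2d = 3 * 3 * C * g         # one 3x3 conv mapping C -> g channels
params_3d = 3 * 3 * 3 * C * g     # volumetric 3x3x3 counterpart
print(params_2d, params_3d, params_3d / params_2d)  # 18432 55296 3.0
```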

2. Key Design Choices

Core architectural decisions include:

  • Densely connected convolutions: Each conv receives all prior layer outputs as input, maximizing information flow.
  • Multi-level residual learning: Each dense block is wrapped by a residual skip; several such blocks are stacked and collectively wrapped by an outer skip. This is sometimes called "residual-in-residual."
  • Residual scaling: All residual connections are multiplied by a scaling constant ($\beta$, typically $0.2$), preventing gradient explosion.
  • Absence of batch normalization: For image restoration and super-resolution tasks, batch normalization is eliminated to preserve intensity information and reduce artifacts (Zini et al., 2019, Ha et al., 2024). However, in some surrogate modeling variants, batch normalization is retained (Mo et al., 2019).
  • Activations: LeakyReLU ($0.2$ slope) predominates for image tasks, but Mish and ReLU are employed in regression/surrogate settings (Mo et al., 2019).
  • Weight initialization: Kaiming normal (He) initialization, scaled by $0.1$ (Zini et al., 2019).
Property                  Typical 2D Setting                   3D Variant
Dense block layers (L)    5                                    5
Growth rate (g)           32 or 48                             32
Channels per conv (C)     64                                   64
Batch norm                No (restoration); Yes (surrogate)    No (restoration); Yes (surrogate)
Activation                LeakyReLU (0.2)                      LeakyReLU (0.2)
Residual scaling (β)      0.2                                  0.2
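The scaled Kaiming initialization listed among the design choices can be sketched as follows. This is an assumed minimal NumPy version (fan-in mode, LeakyReLU gain), not code from the cited papers:

```python
import numpy as np

def kaiming_scaled(c_out, c_in, k=3, a=0.2, scale=0.1, rng=None):
    """Kaiming-normal (fan-in) init for a k x k conv, scaled by 0.1.

    a is the LeakyReLU negative slope; scale=0.1 follows the RRDB
    recipe of shrinking initial weights to stabilize deep residual stacks.
    """
    rng = rng or np.random.default_rng()
    fan_in = c_in * k * k
    gain = np.sqrt(2.0 / (1.0 + a ** 2))
    std = gain / np.sqrt(fan_in)
    return scale * rng.normal(0.0, std, size=(c_out, c_in, k, k))

w = kaiming_scaled(64, 64, rng=np.random.default_rng(0))
# Sample std is close to the target 0.1 * gain / sqrt(fan_in) ≈ 0.0058
print(float(w.std()))
```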

3. Mathematical Summary

Let $x$ denote the RRDB input. The computation can be summarized as follows (notation generalized across the cited works):

  • Dense block (per RDB):

f_0 = x; \quad f_i = \sigma \left( W_i * [f_0, \ldots, f_{i-1}] \right), \; i = 1, \ldots, L

  • Local Feature Fusion (LFF):

F_{\mathrm{LFF}} = W_{\mathrm{LFF}} * [f_0, \ldots, f_L]

  • Local residual:

y_{\mathrm{RDB}} = f_0 + \beta\, F_{\mathrm{LFF}}

  • Repetition: This RDB structure is stacked $P$ times (usually $3$ or $5$).
  • Global (outer) residual:

y_{\mathrm{RRDB}}(x) = x + \alpha\, y_{\mathrm{RDB}}^{(P)}

Parameter count and per-layer shapes scale with $C$, $g$, $L$, and spatial dimension (see (Ha et al., 2024) for explicit formulas).
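As an illustration of that scaling, the helper below counts the weights of a single RDB under assumed conventions (every conv, including the fusion layer, uses a $k \times k$ kernel; biases omitted); the cited papers may differ in the fusion kernel size:

```python
def rdb_params(C=64, g=32, L=5, k=3, dim=2):
    """Weight count of one Residual Dense Block (biases omitted).

    Dense layer i takes C + (i-1)*g input channels and emits g; a final
    fusion conv maps C + L*g channels back to C. dim=3 gives the
    volumetric variant with k**dim kernels.
    """
    kernel = k ** dim
    dense = sum(kernel * (C + i * g) * g for i in range(L))
    fusion = kernel * (C + L * g) * C
    return dense + fusion

print(rdb_params())       # 313344 weights for the typical 2D setting
print(rdb_params(dim=3))  # 3x larger: each kernel gains a third 3-wide axis
```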

4. Integration within Deep Models

RRDBs function as central "feature learning units" in diverse deep architectures:

  • Autoencoders for JPEG Artifact Removal (2D and 3D): In (Zini et al., 2019), RRDBs are the central part of both Y-Net (luminance) and CbCr-Net (chroma), each employing encoder–RRDB(s)–decoder skeletons, replacing plain or batch-normalized residual blocks entirely. Y-Net stacks 5 RRDBs; CbCr-Net uses 3 RRDBs and a 3D first-layer convolution.
  • Surrogate Modeling in Subsurface Inversion: (Mo et al., 2019) uses RRDBs for the Deep Residual Dense Convolutional Network (DRDCN), embedding 4 RRDBs in an encoder–bottleneck–decoder configuration and wrapping all RRDBs in an additional global residual skip.
  • Volumetric Medical SR with GANs: (Ha et al., 2024) extends the full body of a GAN generator with a chain of 3D RRDBs, strictly following the residual-in-residual design. The discriminator does not use RRDBs, and 2.5D perceptual loss is applied only to output volumes, not intermediate activations.

There are no explicit long-range encoder–decoder skips outside the RRDB pathway in these applications (Zini et al., 2019).

5. Empirical Benefits and Performance

Reported empirical effects of RRDB usage include:

  • Convergence acceleration and training stability: RRDB-equipped models achieve target errors with markedly fewer training samples or epochs than models based on plain dense blocks or standard ResNet modules (Mo et al., 2019).
  • Gradient flow and feature preservation: Multi-level residual learning mitigates vanishing/exploding gradients, supporting successful training of networks up to 69 layers deep (Mo et al., 2019).
  • Accuracy improvements: In surrogate modeling, DRDCN with RRDB reached $R^2 \approx 0.97$ (RMSE $\approx 0.016$) versus $R^2 \approx 0.93$ (RMSE $\approx 0.024$) for plain dense blocks (Mo et al., 2019). In JPEG restoration, both PSNR and SSIM improve relative to previous state-of-the-art, with the same model handling all compression qualities (Zini et al., 2019).
  • Generality and robustness: RRDB-powered models generalize across input domains and scales, such as JPEG images of unknown quality or volumetric radiology datasets with variable characteristics (Zini et al., 2019, Ha et al., 2024).

6. Comparison to Alternative Block Designs

Standard alternatives include:

  • Residual blocks (ResNet): Facilitate deep network training via shortcut connections but lack dense intra-block feature reuse.
  • Dense blocks (DenseNet): Provide dense connectivity but, without residual scaling or multi-level residuals, become difficult to train at great depth.
  • RRDBs: Combine both advantages: dense feature propagation together with stable, scaled multi-level residuals, enabling deeper and broader models without sacrificing convergence or introducing instability (Mo et al., 2019).

A plausible implication is that RRDBs are particularly effective where both depth and parameter efficiency are critical and domain artifacts (e.g., compression noise, surrogate complexity, volumetric structures) demand hierarchical, stable representations.

7. Variants and 3D Extensions

The 3D extension of RRDB replaces all spatial operations with their volumetric counterparts. Key differences:

  • Kernel expansion: 2D $3 \times 3$ kernels become $3 \times 3 \times 3$; the receptive field grows volumetrically and the parameter count increases by $\approx 3\times$ per conv.
  • Memory scaling: Parameter and feature map storage increase substantially, leading to high computational requirements for large 3D inputs.
  • Perceptual loss adaptation: In (Ha et al., 2024), the 2.5D perceptual loss operates on 2D slices extracted along major anatomical axes, using pretrained 2D CNNs (VGG19), not intermediate RRDB features.

This suggests RRDBs are structurally adaptable, retaining their multi-level residual advantage in higher dimensions and across tasks from restoration to surrogate modeling and super-resolution (Ha et al., 2024).
