Pyramid Adaptive Atrous Convolution (PAAC)

Updated 25 January 2026
  • Pyramid Adaptive Atrous Convolution is a neural module that employs parallel atrous convolutions at multiple dilation rates to extract and fuse multi-scale contextual features.
  • The architecture integrates convolution branches with batch normalization, ReLU activations, and channel-wise attention before feeding features into Transformer layers for long-range dependency modeling.
  • Empirical results show that PAAC improves detection accuracy, reaching up to 98.8% and outperforming single-rate atrous convolutions and standard CNNs.

Pyramid Adaptive Atrous Convolution (PAAC) is a convolutional neural module that enables multi-scale context extraction by integrating parallel atrous (dilated) convolutions within a pyramidal architecture. PAAC is designed to extend receptive field coverage efficiently, fuse multiscale information, and facilitate downstream attention and Transformer-based modeling, as demonstrated in the context of high-accuracy breast cancer mass detection from mammographic images (Pour et al., 18 Jan 2026).

1. Formal Definition and Functional Role

PAAC constitutes a specialized convolutional block where multiple atrous convolutions—each with distinct fixed dilation rates—are computed in parallel over the same input feature map. These branches are subsequently fused via element-wise summation to synthesize multi-scale representations. The PAAC block is positioned at the forefront of the feature extractor within the larger model pipeline. Its fused multiscale output undergoes channel-wise attention, spatial pooling, multi-scale fusion, and is finally processed by Transformer layers for comprehensive context aggregation and long-range dependency modeling.

2. Mathematical Formulation and Operational Details

The essential mechanics of PAAC revolve around the application and fusion of atrous convolutions at several dilation rates. In the general 1D case, a standard atrous convolution at dilation rate r operates as:

y[i] = \sum_{k=1}^{K} x[i + r \cdot k] \cdot w[k]

where x[i] denotes the input, w[k] the kernel weights, and r the dilation factor. Extending this to PAAC for 2D feature maps, adaptation is implemented by executing three parallel convolutions with dilation rates {1, 2, 3}:

f_{d_r} = W *_r f_{in}

Here, *_r designates convolution with dilation rate r. Fusion is performed as an element-wise sum across branches:

f_{PAAC} = f_{d_1} + f_{d_2} + f_{d_3}

Each branch utilizes a 3×3 kernel, stride of 1, and 'same' padding to preserve spatial dimensionality. Batch normalization and ReLU activation are applied post-convolution, yielding three 64-channel outputs fused into a single 64-channel feature map.
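The 1D formulation above can be sanity-checked with a short NumPy sketch. It indexes the kernel from k = 0 rather than k = 1 (a pure shift of the same computation) and evaluates only positions where all taps fall inside the input:

```python
import numpy as np

def atrous_conv1d(x, w, r):
    """1D atrous convolution: y[i] = sum_k x[i + r*k] * w[k],
    computed at the positions where every dilated tap is in range."""
    K = len(w)
    n_out = len(x) - r * (K - 1)
    return np.array([sum(x[i + r * k] * w[k] for k in range(K))
                     for i in range(n_out)])

x = np.arange(10, dtype=float)   # [0, 1, ..., 9]
w = np.array([1.0, 1.0, 1.0])    # 3-tap summing kernel
y1 = atrous_conv1d(x, w, 1)      # taps at offsets 0, 1, 2
y2 = atrous_conv1d(x, w, 2)      # taps at offsets 0, 2, 4 (wider receptive field)
```

With r = 2 the same 3-tap kernel spans 5 input positions, illustrating how dilation widens the receptive field without adding parameters.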

3. Architectural Composition and Attention Mechanism

PAAC is constructed using three parallel Conv2D branches with 3×3 kernels, input channels = 1, output channels = 64, and dilation rates r = 1, 2, 3. Each branch output is subject to BatchNorm2D and ReLU activation. The outputs are combined element-wise, resulting in a fused feature tensor of dimensions [64×227×227].

A downstream channel-wise attention block follows, comprising global average- and max-pooling, concatenated and processed through two dense layers with sigmoid activation to generate a [64×1×1] attention mask. This mask is multiplied with the PAAC output to scale informative channels and suppress noise.
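A minimal PyTorch sketch of this attention block, assuming a hidden width of channels // 4 for the two dense layers (the reduction ratio is not specified in the text):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel-wise attention: global avg- and max-pooled descriptors are
    concatenated and passed through two dense layers with a final sigmoid,
    producing a per-channel mask that rescales the PAAC output."""
    def __init__(self, channels=64, reduction=4):  # reduction=4 is an assumption
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                    # x: [B, C, H, W]
        avg = x.mean(dim=(2, 3))             # [B, C] global average pool
        mx = x.amax(dim=(2, 3))              # [B, C] global max pool
        mask = self.mlp(torch.cat([avg, mx], dim=1))   # [B, C], values in (0, 1)
        return x * mask[:, :, None, None]    # scale informative channels
```

Because the mask lies in (0, 1), the block can only attenuate channels, never amplify them, matching the "suppress noise" role described above.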

A schematic PyTorch implementation is as follows:

import torch
import torch.nn as nn

class PAAC(nn.Module):
    def __init__(self, in_channels=1, out_channels=64):
        super().__init__()
        # One branch per dilation rate; for a 3x3 kernel, padding=r with
        # dilation=r preserves the spatial dimensions ('same' padding)
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3,
                          stride=1, padding=r, dilation=r),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
            for r in (1, 2, 3)
        ])

    def forward(self, fin):  # fin: [B, C=1, H=227, W=227]
        # Element-wise sum fuses the three multi-scale branches
        return sum(branch(fin) for branch in self.branches)  # [B, 64, 227, 227]

4. Integration with Transformer Architecture and Multi-Scale Fusion

Once PAAC features are generated and attended, they undergo max-pooling (2×2, stride = 2) to reduce spatial dimensions to [64×113×113]. Coarser features obtained in parallel are concatenated or summed to form a [192×113×113] feature map. This map is reshaped into a sequence for input into multi-head self-attention Transformer layers, which leverage long-range dependencies. The final output is flattened and classified via a fully connected layer and softmax for the binary breast cancer decision (benign vs. malignant).
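The hand-off from pooled PAAC features to the Transformer can be sketched in PyTorch. The 128-channel coarse branch and the Transformer depth and head count shown are assumptions; the text specifies only the fused shape [192×113×113]:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)
attended = torch.randn(1, 64, 227, 227)      # attended PAAC features
pooled = pool(attended)                      # [1, 64, 113, 113]
coarse = torch.randn(1, 128, 113, 113)       # coarser encoder features (assumed 128-ch)
fused = torch.cat([pooled, coarse], dim=1)   # [1, 192, 113, 113]

# Flatten the spatial grid into a token sequence: one 192-dim token per location
tokens = fused.flatten(2).transpose(1, 2)    # [1, 113*113 = 12769, 192]

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=192, nhead=4, batch_first=True),
    num_layers=2,
)
# The full 12769-token sequence is expensive to attend over; a small slice
# keeps this demo light while exercising the same code path
out = encoder(tokens[:, :256, :])            # [1, 256, 192]
```

Note that 2×2 max-pooling with stride 2 maps 227 to floor((227 − 2)/2) + 1 = 113, consistent with the shapes above.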

Multi-Scale Feature Fusion in this context integrates both the PAAC output and coarser encoder features to comprehensively encode both fine and global structures, enhancing model discrimination capacity for mass detection.

5. Implementation Parameters, Hyperparameter Settings, and Ablation

Reported hyperparameters:

  • Optimizer: Adam, learning rate 1×10^{-4}
  • Batch size: 32
  • Number of epochs: 100
  • Loss function: Dice + Focal Loss (weights λ₁, λ₂ tuned on validation)
  • Weight initialization: He normal for Conv layers
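The combined loss can be sketched as follows for the binary (benign/malignant) head. The λ values and the focal exponent γ are placeholders for the validation-tuned settings, which the text does not report, and applying a soft Dice term to the 2-class softmax output is one plausible reading of "Dice + Focal":

```python
import torch
import torch.nn.functional as F

def dice_focal_loss(logits, targets, lam1=0.5, lam2=0.5, gamma=2.0, eps=1e-6):
    """Weighted Dice + Focal loss. lam1, lam2, gamma are placeholders;
    the paper tunes the lambdas on the validation set."""
    probs = F.softmax(logits, dim=1)                    # [B, 2]
    onehot = F.one_hot(targets, num_classes=2).float()  # [B, 2]

    # Soft Dice over the batch: 1 - 2|P∩T| / (|P| + |T|)
    inter = (probs * onehot).sum()
    dice = 1.0 - (2 * inter + eps) / (probs.sum() + onehot.sum() + eps)

    # Focal loss: down-weight easy, well-classified examples
    pt = (probs * onehot).sum(dim=1)                    # probability of true class
    focal = (-(1 - pt) ** gamma * torch.log(pt + eps)).mean()

    return lam1 * dice + lam2 * focal
```

Confident, correct predictions drive both terms toward zero, while confident errors are penalized heavily by the focal term.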

Key layer configurations (from Table 1 and Fig. 3 in (Pour et al., 18 Jan 2026)):

Layer                | Input Shape                | Output Shape     | Dilation Rates | Kernel Size | Activation
PAAC Branch i        | 1×227×227                  | 64×227×227       | 1, 2, 3        | 3×3         | BN + ReLU
Fused PAAC           | three branches 64×227×227  | 64×227×227       | —              | —           | sum
Channel Attention    | 64×227×227                 | 64×1×1           | —              | —           | sigmoid
MaxPool2D            | 64×227×227                 | 64×113×113       | —              | 2×2         | —
Multi-Scale Fusion   | 64×113×113 + coarse        | 192×113×113      | —              | —           | concat/sum
Transformer Sequence | 192×113×113                | 16×(113×113×192) | —              | —           | MHSA, FFN
FC Output            | flattened                  | 2                | —              | —           | softmax

6. Quantitative Performance and Comparative Analysis

Ablation and benchmarking results (Table 2 in (Pour et al., 18 Jan 2026)) establish PAAC's contribution over baseline architectures:

  • No PAAC (standard CNN): accuracy ≈ 95.0%
  • Single-rate atrous convolution (r = 2): ≈ 97.0%
  • PAAC (multi-rate pyramid, no Transformer): ≈ 98.5%
  • Full PAAC + Transformer: 98.8% accuracy, 99.42% sensitivity, 98.01% specificity, F1-score 98.93%

Compared against the Multi-Scale CNN of Zhang et al., PAAC + Transformer yields a 0.3 percentage-point improvement (98.8% vs. 98.5%). The multi-rate pyramid alone gains approximately 1.5 points over single-rate atrous convolution (98.5% vs. 97.0%), and the full model gains 3.8 points over a standard CNN (98.8% vs. 95.0%), substantiating the value of multi-scale parallel dilation.

7. Architectural Visualizations and Module Properties

Figure 1 in (Pour et al., 18 Jan 2026) illustrates the PAAC pipeline: parallel atrous branches, fusion, channel attention, pooling, multi-scale feature fusion, and the Transformer stack. PAAC is characterized as lightweight and parameter-efficient, dynamically extending receptive field coverage via simultaneous multi-dilation convolutions. Downstream attention mechanisms are pivotal in highlighting relevant channel-wise features for accurate discrimination.

PAAC is embedded as an initial block in a combined CNN–Transformer workflow. Its parameter efficiency and adaptability position it as a salient module for complex medical image analysis tasks where nuanced multi-scale texture information is critical.


In summary, Pyramid Adaptive Atrous Convolution enables efficient multi-scale feature extraction through a parallel, fixed-rate pyramidal approach, fused and scaled via attention, and integrated into Transformer-based medical image analysis pipelines. Its empirical gains in breast cancer detection accuracy indicate its robustness and utility for discriminative tasks involving complex spatial structures (Pour et al., 18 Jan 2026).
