Gradient Supplementary Module (GSM)
- GSM is a neural module that integrates raw gradient data using convolution and SE attention blocks to enhance edge positioning in infrared images.
- It fuses gradient and main branch features through a residual pathway, improving spatial detail and small target discrimination.
- Ablation studies show that the full GSM configuration achieves an IoU of 0.8142, validating its effectiveness in enhancing edge fidelity.
The Gradient Supplementary Module (GSM) is a neural architecture component introduced in "Gradient-Guided Learning Network for Infrared Small Target Detection" to encode raw gradient information into deep network layers. Designed to alleviate inaccurate edge positioning and to improve small target discrimination in infrared imagery, GSM systematically fuses gradient magnitude information with learned feature maps, enhancing spatial detail representation and feature extraction capacity (Zhao et al., 10 Dec 2025).
1. Raw Gradient Magnitude Computation
GSM operates on the gradient magnitude image derived from the input intensity map. Although the precise operator is not specified, the referenced computation aligns with standard image-processing practice. The 2D gradient magnitude at pixel $(x, y)$ is computed as

$$G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2},$$

where $G_x$ and $G_y$ are the horizontal and vertical derivative responses. A typical implementation employs 3×3 Sobel kernels for $G_x$ and $G_y$, with the gradient magnitude assembled channel-wise as above. This approach extracts the edge detail necessary for robust infrared target delineation.
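Since the exact operator is not specified in the paper, the computation above can be sketched with the standard Sobel kernels in PyTorch (the function name and the small epsilon are illustrative choices, not from the paper):

```python
import torch
import torch.nn.functional as F

def gradient_magnitude(img: torch.Tensor) -> torch.Tensor:
    """Per-channel Sobel gradient magnitude of an (N, C, H, W) tensor.

    Uses the standard 3x3 Sobel pair; the paper does not pin down the
    operator, so this is an assumed, typical implementation.
    """
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]], dtype=img.dtype).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)  # vertical Sobel kernel is the transpose
    c = img.shape[1]
    # depthwise convolution: apply one Sobel pair per input channel
    gx = F.conv2d(img, kx.expand(c, 1, 3, 3), padding=1, groups=c)
    gy = F.conv2d(img, ky.expand(c, 1, 3, 3), padding=1, groups=c)
    # small epsilon keeps the sqrt gradient finite in flat regions
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-12)
```

A vertical step edge yields a strong response along the edge column and near-zero response in flat regions, which is the behavior the supplementary branch relies on.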
2. Structural Composition of GSM
GSM is composed of two primary blocks at each Stage of the network:
- G_Block: This consists of a single 3×3 convolution (padding=1, stride=1) applied to a spatially pooled gradient magnitude map, immediately followed by a squeeze-and-excitation (SE) attention block.
- Res (Residual Fusion): The Res block performs feature fusion, combining the main branch feature map $F_{\text{main}}$ and the SE-weighted output of G_Block, $\tilde{F}_g$, via element-wise summation:

$$F_{\text{out}} = F_{\text{main}} + \tilde{F}_g,$$

where $\tilde{F}_g = \mathrm{SE}(F_g)$ is the SE transformation of the gradient feature.
The SE block follows the established formulation:

$$s = \sigma\!\left(W_2\,\delta\!\left(W_1\,\mathrm{GAP}(F)\right)\right), \qquad \tilde{F} = s \odot F,$$

with $W_1$ reducing the channel width by a ratio $r$ and $W_2$ restoring the channel dimension; $\mathrm{GAP}$ denotes global average pooling, $\delta$ the ReLU activation, $\sigma$ the sigmoid, and $\odot$ channel-wise scaling. Channel and architectural hyperparameters are not enumerated in the referenced work.
3. Attention Mechanisms
Channel attention within GSM utilizes a squeeze-and-excitation (SE) block to recalibrate the feature responses adaptively. The channel-wise weights $s$ are derived by globally averaging each channel, then passing the result through a two-layer fully connected bottleneck (with reduction ratio $r$) and a sigmoid activation. The reweighted features are produced as

$$\tilde{F} = s \odot F.$$
No spatial attention or gating is introduced in GSM beyond this channel-wise mechanism.
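The channel-wise recalibration above corresponds to a standard SE block. A minimal PyTorch sketch (the reduction ratio and channel widths are assumptions, since the paper does not enumerate them):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: global average pool -> bottleneck -> sigmoid scale."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, hidden),   # W1: squeeze channel width by r
            nn.ReLU(inplace=True),         # delta
            nn.Linear(hidden, channels),   # W2: restore channel dimension
            nn.Sigmoid(),                  # sigma: weights s in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        s = self.fc(x.mean(dim=(2, 3)))    # GAP over spatial dims, then bottleneck
        return x * s.view(n, c, 1, 1)      # channel-wise reweighting, s ⊙ F
```

Because each weight lies in (0, 1), the block can only attenuate channels relative to the input, which is the adaptive recalibration described above.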
4. Fusion of Gradient and Main Branch Features
At each Stage, GSM fuses gradient-derived and main branch features via a residual pathway:
- The raw gradient map is spatially downsampled (max-pooling) to the resolution of the current Stage.
- Convolution with a 3×3 kernel produces an intermediate feature $F_g$.
- SE attention reweights $F_g$, yielding $\tilde{F}_g$.
- The main branch feature $F_{\text{main}}$ and the SE-weighted gradient feature $\tilde{F}_g$ are summed:

$$F_{\text{out}} = F_{\text{main}} + \tilde{F}_g.$$
This mechanism directly incorporates gradient information, biasing the feature maps toward enhanced edge preservation.
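The pool → conv → SE → residual-sum pathway can be sketched as a single PyTorch module. Channel widths, the pooling factor, and the SE reduction ratio are assumptions not fixed by the paper; only the sequence of operations is taken from the source:

```python
import torch
import torch.nn as nn

class GSM(nn.Module):
    """Sketch of the GSM fusion path: pool -> 3x3 conv -> SE -> residual sum."""

    def __init__(self, channels: int, pool_stride: int, reduction: int = 16):
        super().__init__()
        # downsample the raw gradient map to the current Stage resolution
        self.pool = nn.MaxPool2d(kernel_size=pool_stride, stride=pool_stride)
        # G_Block: a single 3x3 conv (padding=1, stride=1) lifting the
        # 1-channel gradient map to the Stage's channel width
        self.conv = nn.Conv2d(1, channels, kernel_size=3, padding=1, stride=1)
        # SE attention on the gradient feature
        hidden = max(channels // reduction, 1)
        self.se = nn.Sequential(
            nn.Linear(channels, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, channels), nn.Sigmoid(),
        )

    def forward(self, f_main: torch.Tensor, grad_map: torch.Tensor) -> torch.Tensor:
        g = self.conv(self.pool(grad_map))              # intermediate feature F_g
        n, c, _, _ = g.shape
        s = self.se(g.mean(dim=(2, 3))).view(n, c, 1, 1)
        return f_main + g * s                           # residual fusion
```

The residual sum leaves the main branch features intact and merely adds an edge-biased correction, which is why ablating it (simple addition without the G_Block/SE path) degrades IoU.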
5. Integration Within Dual-Branch Network Architecture
GSM is deployed at the terminus of every Stage in the main branch of the dual-branch network. In parallel, the supplementary branch sequentially pools the original gradient magnitude image, matching its resolution to each Stage before feeding it to the corresponding GSM instance. This dual path ensures multi-scale extraction and integration of gradient information throughout the network depth.
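The supplementary branch's sequential pooling can be sketched as follows. The stage count of four and the ×2 pooling factor per Stage are assumptions; the paper only states that resolutions are matched:

```python
import torch
import torch.nn.functional as F

def gradient_pyramid(grad_map: torch.Tensor, num_stages: int = 4) -> list:
    """Sequentially halve the gradient map so each level matches one Stage.

    grad_map: (N, 1, H, W) full-resolution gradient magnitude.
    Returns one tensor per Stage, each fed to that Stage's GSM instance.
    """
    levels = []
    g = grad_map
    for _ in range(num_stages):
        g = F.max_pool2d(g, kernel_size=2, stride=2)  # match next Stage resolution
        levels.append(g)
    return levels
```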
6. Ablation Study and Empirical Results
Controlled ablation experiments outlined in Table II of the reference demonstrate the efficacy of GSM. When the residual fusion is replaced with simple addition (M_G_Add), the IoU drops by 1.28%. Shifting the G_Block to the supplementary branch (M-G-M_Res) diminishes IoU by 0.34%. The network with the full GSM achieves the highest metrics (IoU: 0.8142, nIoU: 0.7858), confirming its contribution to edge fidelity and small target discrimination in infrared images.
| Variant | IoU | nIoU | Relative Performance |
|---|---|---|---|
| M_G_Add | 0.8038 | 0.7797 | −1.28%, −0.78% |
| M-G-M_Res | 0.8108 | 0.7817 | −0.34%, −0.41% |
| Full GSM | 0.8142 | 0.7858 | Baseline |
7. Implementation Details and Hyperparameters
Key training parameters are:
- Learning rate: 5
- Batch size: 4
- Number of epochs: 500
- Optimizer: Not explicitly stated (likely Adam)
- Weight initialization: Not detailed (default PyTorch assumed)
- Regularization: Not specified beyond typical weight decay
Code is available at the referenced GitHub repository; all essential implementation details for GSM are sourced from the original work (Zhao et al., 10 Dec 2025). No additional architectural, optimization, or regularization specifics are enumerated.