SPatchGAN: Advanced Adversarial Discriminators

Updated 10 June 2026

The image-to-image translation variant matches statistical features across multiple scales, replacing patch-wise classification to improve stability and fidelity over classic PatchGAN.
Skip-patch variants fuse multi-scale features with skip connections, enabling effective capture of both local textures and global structures for EM data synthesis.
Patch-wise supervised discriminators provide dense, pixel-level feedback for texture inpainting, yielding sharper outputs and enhanced structure preservation.

SPatchGAN refers to several distinct adversarial discriminator architectures introduced independently in the image-to-image translation, electron microscopy data generation, and texture inpainting literature. The implementations differ in detail and application domain, but every variant targets the limitations of the classic PatchGAN discriminator, using a statistical, multi-scale, or patch-wise supervised design to improve stability, fidelity, and structure preservation.

1. Statistical Feature-Based Discriminator for Unsupervised Image-to-Image Translation

The original SPatchGAN architecture, introduced by Shao et al. for unsupervised image-to-image translation, departs from PatchGAN by replacing direct patch-wise classification with distribution matching of statistical features at multiple spatial scales (Shao et al., 2021).

Architecture

Input: $x \in \mathbb{R}^{H_0 \times W_0 \times 3}$
Shared feature extraction:
- Conv $4\times4$, stride 2, 256 SN-LReLU $\to$ $H_0/2 \times W_0/2 \times 256$
- Conv $4\times4$, stride 2, 512 SN-LReLU $\to$ $H_0/4 \times W_0/4 \times 512$
Multi-scale pathway (for $m=1\dots4$):
- Downsample: Conv $4\times4$, stride 2, 1024 SN-LReLU $\to$ $H_0/2^{m+2} \times W_0/2^{m+2} \times 1024$
- Adaptation: two stride-1 Conv layers, 1024 SN-LReLU (spatial size unchanged)
Statistical feature extraction (per channel, over all spatial positions of the adapted feature map):
- Mean: channel-wise mean (global average pooling)
- Max: channel-wise max (global max pooling)
- Std: channel-wise uncorrected standard deviation
Per-feature MLP discriminators: for each scale $m$ and each statistic, a separate MLP: FC1024 SN-LReLU $\to$ FC1024 SN-LReLU $\to$ FC1 SN $\to$ scalar output
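
The spatial sizes in the list above can be traced with a few lines of arithmetic. This is a minimal sketch, assuming the scales are cascaded (each downsampling block feeds the next) and that every $4\times4$ convolution uses padding 1; neither detail is stated in the text, and the names `conv_out` and `trace_shapes` are illustrative only.

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    """Spatial size after one convolution (standard floor formula).

    pad=1 is an assumption; the text gives only kernel size and stride.
    """
    return (size + 2 * pad - kernel) // stride + 1


def trace_shapes(h0, w0, num_scales=4):
    """Return (name, H, W, C) for the shared trunk and each scale."""
    shapes = []
    h, w = h0, w0
    # Shared feature extraction: two stride-2 convolutions.
    for name, channels in (("shared1", 256), ("shared2", 512)):
        h, w = conv_out(h), conv_out(w)
        shapes.append((name, h, w, channels))
    # Multi-scale pathway: one stride-2 downsampling block per scale (assumed cascaded).
    for m in range(1, num_scales + 1):
        h, w = conv_out(h), conv_out(w)
        shapes.append((f"scale{m}", h, w, 1024))
    return shapes


for entry in trace_shapes(256, 256):
    print(entry)
```

For a 256×256 input, the trace ends at a 4×4×1024 map for the fourth scale under these assumptions.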

PatchGAN classifies overlapping patches by sliding one shared set of convolutional filters over the image and scoring each location. SPatchGAN instead pools over all local regions, aggregating feature statistics globally and passing them through a distinct MLP for each statistic and scale.
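
The three pooled statistics can be illustrated on a toy feature map. The sketch below uses plain nested lists rather than a tensor library, and the function name `channel_stats` is hypothetical; "uncorrected" is taken to mean dividing by $n$ rather than $n-1$.

```python
import math


def channel_stats(feature_map):
    """Compute per-channel mean, max, and uncorrected std.

    feature_map: list of channels, each a 2D list (H x W) of floats.
    Returns three lists, each with one entry per channel.
    """
    means, maxes, stds = [], [], []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        n = len(values)
        mean = sum(values) / n
        # Uncorrected variance: divide by n, not n - 1.
        var = sum((v - mean) ** 2 for v in values) / n
        means.append(mean)
        maxes.append(max(values))
        stds.append(math.sqrt(var))
    return means, maxes, stds


toy = [
    [[1.0, 3.0], [5.0, 7.0]],  # channel 0
    [[2.0, 2.0], [2.0, 2.0]],  # channel 1 (constant)
]
means, maxes, stds = channel_stats(toy)
print(means, maxes, stds)
```

Each of the three resulting vectors has one entry per channel and would be fed to its own MLP, so the spatial layout is discarded before any real/fake decision is made.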

2. Statistical Feature Matching and Loss Formulation

SPatchGAN replaces conventional patch-based adversarial loss with distribution matching based on statistical summaries (Shao et al., 2021).

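Optimal discriminator for a statistical feature $s$ (one scale, one statistic), with $p_{\text{data}}$ and $p_g$ the densities of $s$ over real and generated images:

$$D^*(s) = \frac{p_{\text{data}}(s)}{p_{\text{data}}(s) + p_g(s)}$$

Least-squares adversarial loss (LSGAN style, 0–1 coding), with $D_{m,k}$ the MLP for scale $m$ and statistic $k$, $y$ a real target image, and $G(x)$ a translated source image:
- Discriminator:

$$\mathcal{L}_D = \sum_{m}\sum_{k}\Big(\mathbb{E}_{y}\big[(D_{m,k}(y)-1)^2\big] + \mathbb{E}_{x}\big[D_{m,k}(G(x))^2\big]\Big)$$

- Generator:

$$\mathcal{L}_G^{\text{adv}} = \sum_{m}\sum_{k}\mathbb{E}_{x}\big[(D_{m,k}(G(x))-1)^2\big]$$

Additional generator objectives:
- Weak forward-cycle loss on low-res images
- Identity loss on full-res target images
- Full generator loss: weighted sum of the adversarial, weak-cycle, and identity losses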

This statistical framework enables stable adversarial training, reduces oscillatory gradients, and allows for relaxed cycle constraints without sacrificing mode stability.
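
A least-squares objective with 0–1 coding, summed over the per-scale, per-statistic heads, can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: the dictionary layout, the equal weighting of heads, and the function names are all hypothetical. The helper `optimal_d` encodes the standard LSGAN result that, for 0–1 coding, the pointwise-optimal score is $p_{\text{data}}/(p_{\text{data}}+p_g)$.

```python
def optimal_d(p_data, p_g):
    """Pointwise minimizer of p_data*(d-1)^2 + p_g*d^2 over d."""
    return p_data / (p_data + p_g)


def lsgan_d_loss(real_scores, fake_scores):
    """Discriminator loss: push real scores toward 1 and fake scores toward 0.

    Both arguments map a head key, e.g. (scale, statistic), to a list of
    scalar scores over the batch. Heads are weighted equally (assumption).
    """
    total = 0.0
    for head, real in real_scores.items():
        fake = fake_scores[head]
        total += sum((s - 1.0) ** 2 for s in real) / len(real)
        total += sum(s ** 2 for s in fake) / len(fake)
    return total


def lsgan_g_loss(fake_scores):
    """Generator adversarial loss: push fake scores toward 1."""
    return sum(
        sum((s - 1.0) ** 2 for s in scores) / len(scores)
        for scores in fake_scores.values()
    )


# Four scales times three statistics gives twelve heads.
heads = [(m, k) for m in range(1, 5) for k in ("mean", "max", "std")]
real = {h: [0.9, 1.1] for h in heads}
fake = {h: [0.2, -0.1] for h in heads}
print(lsgan_d_loss(real, fake), lsgan_g_loss(fake))
```

Because each head sees only a pooled statistic, every term in the sum compares two low-dimensional distributions, which is the sense in which the loss performs distribution matching on statistical summaries.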

3. SPatchGAN with Skip-Patch Discriminators for Multi-Scale Adversarial Feedback

A distinct line of work employs the term SPatchGAN to designate "skip-patch" discriminator architectures, particularly in the context of synthesizing biological electron microscopy (EM) data (Roy et al., 2024). This variant fuses multiple spatial resolutions by concatenating features from different convolutional layers via skip connections:

Discriminator (Skip-Patch)

Each output decision (patch-score) aggregates information from receptive fields of 16×16, 20×20, 32×32, and 70×70 pixels.
Architecture incorporates skip connections from intermediate layers (after Conv1-4) into a fusion convolution, combining upsampled feature maps to produce a 64×64 grid of real/fake probabilities.
Generator is a U-Net (instance-norm based for artifact avoidance).
Adversarial loss sums over all patch scores in the output grid, allowing each discriminator output to enforce both fine local textures and global consistency.

Multi-scale patch aggregation in each discriminator decision counteracts the limitations of single-scale PatchGAN, ensuring that both mesoscale structure and microtextures are respected.
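
The receptive field seen by a convolutional layer follows a simple recurrence: each layer widens the field by (kernel − 1) times the accumulated stride. The sketch below applies it to the layer settings of the classic 70×70 PatchGAN (five $4\times4$ convolutions with strides 2, 2, 2, 1, 1), which is an assumption used for illustration. It is not the skip-patch configuration, whose intermediate sizes (16, 20, 32) depend on layer settings not given here.

```python
def receptive_fields(layers):
    """Return the receptive field size after each layer.

    layers: list of (kernel, stride) tuples, applied in order.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    sizes = []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
        sizes.append(rf)
    return sizes


# Classic 70x70 PatchGAN layer settings (assumed, for illustration only).
PATCHGAN_70 = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_fields(PATCHGAN_70))  # [4, 10, 22, 46, 70]
```

In a single-scale PatchGAN only the last value reaches the decision layer. A skip-patch design routes the intermediate feature maps into the fusion step as well, so each score draws on several receptive field sizes at once.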

4. Patch-Wise Supervision for Texture Inpainting

A third instance of SPatchGAN, introduced for texture inpainting, redefines the discriminator task as patch-level segmentation (Saad et al., 2019):

Discriminator ("segmentor") outputs a dense map predicting, for each pixel, the likelihood it is fake (i.e., inpainted).
Supervision is supplied by the inpainting mask: discriminator is optimized with binary cross-entropy, treating inpainting labels as the ground-truth segmentation.
Features are extracted at three scales (16×16, 32×32, 64×64 receptive fields via ResNet-18 backbones); maps upsampled and fused to the original resolution.
Generator is a U-Net with dilated conv bottleneck, directly propagating local contextual information through skip connections.
Objective combines segmentor loss, adversarial BCE, and a reconstruction loss restricted to the mask.

This approach yields highly localized perceptual gradients, promoting sharp, context-consistent inpainting with reduced blurring and boundary artifacts.
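
The two mask-driven terms can be sketched on small nested lists. This is a minimal illustration with hypothetical function names: `segmentor_bce` scores the dense fake-probability map against the inpainting mask (1 = inpainted pixel), and `masked_l1` restricts reconstruction error to masked pixels. The choice of an L1 norm for the reconstruction term is an assumption, since the norm is not specified here.

```python
import math


def segmentor_bce(pred, mask, eps=1e-7):
    """Mean binary cross-entropy between predicted fake-probabilities and the mask."""
    total, count = 0.0, 0
    for pred_row, mask_row in zip(pred, mask):
        for p, y in zip(pred_row, mask_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count


def masked_l1(output, target, mask):
    """Mean absolute error over masked pixels only (L1 is an assumption)."""
    total, count = 0.0, 0
    for out_row, tgt_row, mask_row in zip(output, target, mask):
        for o, t, y in zip(out_row, tgt_row, mask_row):
            if y:
                total += abs(o - t)
                count += 1
    return total / count if count else 0.0


mask = [[0, 1], [1, 0]]
pred = [[0.1, 0.8], [0.7, 0.2]]
print(segmentor_bce(pred, mask))
print(masked_l1([[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.0, 4.0]], mask))
```

Since the segmentor loss is computed per pixel, its gradient with respect to the generator output is spatially localized, which matches the description of dense, localized feedback above.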

5. Training Protocols and Hyperparameterization

Image-to-Image SPatchGAN (Shao et al., 2021)

Optimizer: Adam with weight decay
Batch size: 4; iterations: 500,000
Learning rate: decays from its initial value over training
Loss weights: two fixed weights and a third that varies (20, 10, 30)
Spectral normalization and multi-scale backbone for stability

EM SPatchGAN (Roy et al., 2024)

Optimizer: Adam
Batch size: 1 (due to data scarcity)
Epochs: 6000 for mask GAN, 1500 for EM cGAN
InstanceNorm replaces BatchNorm in both G and D
Aggressive augmentation for overfitting mitigation

Inpainting SPatchGAN (Saad et al., 2019)

Optimizer: Adam
Learning rates: set separately for G and D
Batch size: 16
Epochs: 200
Zero-centered gradient penalty for stabilization

6. Empirical Performance and Ablations

Quantitative Evaluation

Model	FID (Selfie→Anime)	KID (Selfie→Anime)	FID (Male→Female)	KID (Male→Female)	FID (Glasses Removal)	KID (Glasses Removal)
SPatchGAN	83.3	0.0214	8.73	0.0056	13.9	0.0031
Multi-Scale PatchGAN	94.0	0.0362	—	—	—	—
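
For the Selfie→Anime column, where both rows report values, the gap can be expressed as a relative reduction. The snippet below is plain arithmetic on the table entries; the helper name `relative_reduction` is illustrative.

```python
def relative_reduction(baseline, value):
    """Fractional reduction of `value` relative to `baseline` (lower is better)."""
    return (baseline - value) / baseline


# Selfie→Anime entries from the table: Multi-Scale PatchGAN vs. SPatchGAN.
fid_gain = relative_reduction(94.0, 83.3)
kid_gain = relative_reduction(0.0362, 0.0214)
print(f"FID: {fid_gain:.1%} lower, KID: {kid_gain:.1%} lower")
```

This works out to roughly an 11% lower FID and a 41% lower KID for SPatchGAN on that task.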

Ablations indicate that removing any statistical feature (mean, max, stddev) degrades FID/KID and yields specific failure modes (color imbalance, blurriness, incoherent lines) (Shao et al., 2021).
EM generation: FID of 42.9 and SSIM of 0.81 with SPatchGAN (improvements of ≈36% and ≈31% over pix2pix) on EM images (Roy et al., 2024).
Texture inpainting: SPatchGAN achieves MPS of 97.3%, PSNR of 27.54 dB, SSIM of 0.937 on DTD, outperforming CE, GLCIC, and GLPG (Saad et al., 2019).

Qualitative Findings

Image-to-image SPatchGAN yields more coherent hair/face shapes and fine line detail.
Skip-patch SPatchGAN eliminates "checker" artifacts and enforces both thin membranes and global EM structure.
Inpainting SPatchGAN achieves sharper, boundary-consistent results, with the discriminator reliably localizing inpainted regions.

7. Significance and Implications

Statistical aggregation provides the discriminator with a global, shape-aware view and more stable gradients, facilitating larger deformations and fine-detail synthesis in translation tasks (Shao et al., 2021).
Skip-patch architectures unify local and global semantics at every decision point, overcoming the overly local focus of classic PatchGANs and accelerating convergence by ≈2× (Roy et al., 2024).
Patch-wise supervised discriminators enable direct localization of generated/fake regions, offering more informative signals for inpainting and yielding superior perceptual and pixel-wise metrics (Saad et al., 2019).

A plausible implication is that the term "SPatchGAN" functions as an umbrella for advanced discriminator designs that go beyond simple patch-wise judgments, each variant tailored to the spatial structure and semantic requirements of its target application.


References

Shao et al. (2021). SPatchGAN: A Statistical Feature Based Discriminator for Unsupervised Image-to-Image Translation.

Roy et al. (2024). GAN with Skip Patch Discriminator for Biological Electron Microscopy Image Generation.

Saad et al. (2019). Where is the Fake? Patch-Wise Supervised GANs for Texture Inpainting.
