
Progressive Group-Wise Test-Time Training (PGT)

Updated 3 December 2025
  • The paper introduces PGT, a method that progressively improves pixel-level anomaly segmentation using test-time adaptation of a lightweight Anomaly Refinement Decoder (ARD).
  • PGT leverages a frozen ZSAD backbone to extract coarse features and refines them through iterative training on pseudo-normal and synthetically augmented images.
  • Empirical results on datasets like MVTec AD and VisA demonstrate significant gains in pixel-AP and F1 scores while preserving the underlying model’s stability.

Progressive Group-Wise Test-Time Training (PGT) is a test-time adaptation strategy designed for industrial anomaly segmentation scenarios, particularly in mass-production pipelines where nearly identical parts are inspected sequentially in large quantities. PGT was introduced in the context of zero-shot anomaly detection (ZSAD) approaches, which by design only use unlabeled test-time data and typically operate at coarse, patch-level resolutions. The method targets the refinement of low-resolution anomaly maps into pixel-precise segmentations by progressively adapting a lightweight Anomaly Refinement Decoder (ARD) across groups of incoming product images, thereby improving anomaly localization without access to real anomaly ground truth or labeled data (Huang et al., 27 Nov 2025).

1. Architecture and Operational Framework

PGT is centered on two modular components: a frozen ZSAD backbone (e.g., APRIL-GAN, VCP-CLIP, MuSc) and a lightweight ARD. The ZSAD backbone extracts Vision Transformer (ViT) patch-level features $F \in \mathbb{R}^{H/14 \times W/14 \times C}$ and generates a coarse anomaly map $A \in \mathbb{R}^{H/14 \times W/14 \times 1}$. The ARD refines these into high-resolution maps $\bar{A}$. Only the ARD parameters $\theta$ are updated during test-time adaptation, protecting the ZSAD model against catastrophic forgetting and distributional drift. PGT is designed to be compatible with any ZSAD model operating on ViT features.
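This two-component layout can be sketched in a few lines. The following is a toy illustration, not the paper's implementation: the backbone is replaced by a stub producing randomly filled tensors with the stated shapes, and the ARD is reduced to a hypothetical learned projection plus upsampling (function names and the feature width C = 768 are assumptions).

```python
import numpy as np

def frozen_zsad_backbone(image, patch=14, C=768):
    """Stand-in for a frozen ZSAD backbone (e.g. APRIL-GAN): returns
    ViT patch-level features F and a coarse anomaly map A. Here both
    are random placeholders with the shapes described in the text."""
    H, W, _ = image.shape
    h, w = H // patch, W // patch
    rng = np.random.default_rng(0)
    F = rng.standard_normal((h, w, C))   # F in R^{H/14 x W/14 x C}
    A = rng.random((h, w, 1))            # A in R^{H/14 x W/14 x 1}
    return F, A

def ard_refine(F, A, theta, out_size):
    """Toy Anomaly Refinement Decoder: a learned 1x1 projection of the
    patch features fused with the coarse map, then nearest-neighbour
    upsampling to full resolution. Only `theta` would be trained."""
    logits = F @ theta + A[..., 0]                       # fuse features + coarse map
    scale = out_size // logits.shape[0]
    refined = np.kron(logits, np.ones((scale, scale)))   # upsample to HxW
    return 1.0 / (1.0 + np.exp(-refined))                # sigmoid -> [0, 1]

image = np.zeros((224, 224, 3))
F, A = frozen_zsad_backbone(image)
theta = np.zeros(F.shape[-1])            # ARD parameters (the only trained part)
A_bar = ard_refine(F, A, theta, out_size=224)
print(A_bar.shape)
```

The key structural point survives even in this sketch: the backbone is called but never updated, while `theta` is the sole adaptable state.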

2. Algorithmic Workflow

Given a stream of incoming test images partitioned into $K$ non-overlapping groups $G_1, \dots, G_K$, each of size $u$ (empirically $u=30$ yields the best tradeoff), PGT applies an iterative process comprising:

  1. Coarse Anomaly Estimation: For group $G_i$, compute patch-level anomaly maps $A_j$ and image-level anomaly scores $s_j = \max(A_j)$.
  2. Pseudo-Normal Selection: Rank $G_i$ by $s_j$ and select the $r$ images with the lowest scores as “pseudo-normal” (typically $r=5$).
  3. Pseudo-Anomaly Synthesis: For each pseudo-normal image $I^+$, generate $l$ synthetic anomalies (usually $l=4$) by masking (Perlin noise) and pasting random textures (from DTD), forming pairs $(\tilde{I}, M)$.
  4. Test-Time Training: Fine-tune the ARD on these synthetic pairs for $E=20$ epochs using a Dice loss objective:

$$\mathcal{L}(\theta) = \sum_{p=1}^{r l} \mathrm{Dice}\left(\bar{A}_p, M_p\right), \quad \mathrm{Dice}(X, Y) = 1 - \frac{2\sum X \odot Y}{\sum X + \sum Y + \varepsilon}$$

  5. Refinement: For $i > 1$, use the updated ARD to refine incoming images in $G_i$, producing final anomaly maps as the average of coarse and refined outputs: $\mathrm{FinalMap}_j = (A_j + \bar{A}_j)/2$.
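Steps 1, 2, and 4 above can be sketched directly from the definitions. This is a minimal illustration, not the paper's code: the Dice loss follows the formula in the objective, pseudo-normal selection ranks a group by its max-score, and the coarse maps are random toys standing in for real backbone outputs.

```python
import numpy as np

def dice_loss(X, Y, eps=1e-6):
    """Dice(X, Y) = 1 - 2*sum(X ⊙ Y) / (sum(X) + sum(Y) + eps),
    matching the test-time training objective."""
    return 1.0 - 2.0 * np.sum(X * Y) / (np.sum(X) + np.sum(Y) + eps)

def select_pseudo_normals(group_maps, r=5):
    """Rank a group by image-level score s_j = max(A_j) and return the
    indices of the r lowest-scoring images (the pseudo-normals)."""
    scores = [float(m.max()) for m in group_maps]
    return list(np.argsort(scores)[:r])

# Toy demonstration on one group of u = 30 random coarse anomaly maps.
rng = np.random.default_rng(1)
group = [rng.random((16, 16)) * s for s in rng.uniform(0.2, 1.0, size=30)]
pseudo_normal_idx = select_pseudo_normals(group, r=5)

# Sanity check of the loss: a perfect prediction scores ~0,
# a fully disjoint prediction scores ~1.
mask = np.zeros((16, 16))
mask[4:8, 4:8] = 1.0
print(round(dice_loss(mask, mask), 4), round(dice_loss(mask, 1 - mask), 4))
```

In the full method, `dice_loss` would be summed over the $r \cdot l$ synthetic pairs and minimized with SGD over the ARD parameters only.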

At every group, this process enables the ARD to incrementally adapt, counteracting the distributional gap between synthetic and real anomalies without introducing overfitting or semantic drift.
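The pseudo-anomaly synthesis of step 3 can also be sketched. This simplified stand-in substitutes smoothed random noise for true Perlin noise and a random texture for a DTD crop (the function name and threshold are illustrative assumptions); the real pipeline uses the actual Perlin and DTD sources.

```python
import numpy as np

def synthesize_pseudo_anomaly(normal_img, rng, thresh=0.6):
    """Simplified synthesis: a low-frequency noise mask (in place of
    Perlin noise) selects regions where a random texture (in place of
    a DTD crop) is pasted onto the pseudo-normal image, yielding the
    pair (I_tilde, M)."""
    H, W, _ = normal_img.shape
    # Low-frequency noise: coarse grid, upsampled, then thresholded.
    coarse = rng.random((H // 32, W // 32))
    noise = np.kron(coarse, np.ones((32, 32)))
    M = (noise > thresh).astype(np.float64)              # binary anomaly mask
    texture = rng.random((H, W, 3))                      # stand-in for a DTD texture
    I_tilde = normal_img * (1 - M[..., None]) + texture * M[..., None]
    return I_tilde, M

rng = np.random.default_rng(2)
I_plus = np.full((224, 224, 3), 0.5)                     # a pseudo-normal image
pairs = [synthesize_pseudo_anomaly(I_plus, rng) for _ in range(4)]  # l = 4 per normal
```

Each pair supplies a pixel-exact mask $M$, which is what allows the Dice objective to supervise the ARD at full resolution despite the absence of real anomaly labels.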

3. Motivation and Advantages

ZSAD systems do not require annotated anomalies and instead rely on feature-level novelty detection to flag candidate anomalous regions. However, this produces coarse, low-granularity results due to the inherent spatial resolution of ViT patch features. Offline-trained, supervised refinement decoders generalize poorly because of the distribution gap between synthetic and real anomalies.

PGT is motivated by several needs:

  • Leveraging mass-production regularities: Groups of near-identical products in sequential batches.
  • Unsupervised adaptation: Utilizing only the unlabeled, streaming test data.
  • Distributional robustness: Adapting in small increments per group to reduce both overfitting and training bias from unrepresentative synthetic anomalies.
  • Preservation of data flow: Adhering to grouped processing constraints customary in industrial visual inspection.

PGT differentiates itself from single-image adaptation and dataset-level adaptation by operating under realistic industrial constraints and specifically targeting spatially fine segmentation improvement at test-time.

4. Hyperparameter Selection and Robustness

Empirical ablations on the MVTec AD and VisA datasets guide preferred hyperparameters:

  • Group size $u=30$ (comparable results for $u \in \{20, 40\}$, within ±0.5% of the best metrics).
  • Pseudo-normal selection $r=5$, with performance peaking around this value.
  • Number of pseudo-anomalies per normal image $l=4$, with diminishing returns beyond that.
  • Learning rate $\alpha = 10^{-3}$ with SGD; 20 epochs per group (fixed).
  • Synthetic mask and texture choices: Perlin noise and DTD textures by default; variants such as CutPaste and SeaS can be substituted with neutral to slightly positive effects on performance.
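Collected in one place, the reported defaults look as follows; the dictionary and its key names are illustrative, but the values come directly from the ablations above.

```python
# Reported PGT defaults (values from the ablations; names are illustrative).
PGT_DEFAULTS = {
    "group_size_u": 30,           # images per group G_i
    "pseudo_normals_r": 5,        # lowest-scoring images kept per group
    "anomalies_per_normal_l": 4,  # synthetic anomalies per pseudo-normal
    "epochs_per_group_E": 20,     # ARD fine-tuning epochs per group
    "learning_rate": 1e-3,        # SGD, applied to ARD parameters only
    "mask_source": "perlin",      # alternatives: CutPaste, SeaS
    "texture_source": "DTD",
}

# Synthetic training pairs produced per group: r * l
pairs_per_group = (PGT_DEFAULTS["pseudo_normals_r"]
                   * PGT_DEFAULTS["anomalies_per_normal_l"])
print(pairs_per_group)  # → 20
```

With these defaults, each group of 30 images yields 20 synthetic pairs for the Dice objective, a deliberately small training set that keeps per-group adaptation cheap.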

PGT's robustness to imperfect pseudo-normal selection is documented: up to 40% contamination with true anomalies among pseudo-normals results in only a 0.8% drop in pixel-AP, reflecting the method's resilience to noisy group composition.

5. Empirical Results and Performance

PGT yields measurable improvements in pixel-level anomaly segmentation across multiple ZSAD backbones and datasets:

  • VCP-CLIP+PGT on MVTec AD: pixel-AP improves by +2.5%, F1 by +2.8%.
  • MuSc+PGT: up to +5.1% gain in pixel-AP.
  • VisA dataset: up to +5.2% in segmentation AP.
  • Full progressive group-wise adaptation achieves higher scores (pixel-AP=51.6%) than training only on the initial group (pixel-AP=51.0%), confirming the benefit of incremental adaptation.
  • Averaging coarse and refined outputs effectively suppresses false positives and negatives.

Selecting pseudo-normals and synthesizing anomalies independently for each group is shown to be more effective than sharing samples across groups.

6. Integration Strategy and Deployment Considerations

PGT is modular and can be integrated into any ZSAD pipeline with ViT backbone by inserting the ARD downstream of the patch-level anomaly map. The main computational cost arises from retraining ARD for 20 epochs per group, typically processed offline between inspection runs in mass-production settings. Only the ARD undergoes parameter updates, ensuring that the underlying ZSAD remains static and empirically stable. This design limits the risk of catastrophic drift often observed in end-to-end test-time adaptation schemes.

Ablative studies underscore that diverse pseudo-normal sampling outweighs the impact of synthetic anomaly details. The method tolerates high levels of label noise in pseudo-normals and benefits cumulatively from exposure to varied synthetic training conditions across groups.

7. Limitations and Prospective Directions

PGT introduces a computational test-time overhead due to episodic ARD fine-tuning. This limitation is mitigated in industrial workflows by offline group processing. The method's efficacy is enhanced by diversity among synthetic anomalies and pseudo-normals; its performance plateaus when synthetic augmentation variation is saturated.

This suggests that further benefits may accrue from dynamically selecting augmentation strategies or incorporating domain simulation to reduce the remaining distribution gap. Nevertheless, PGT provides a principled operational endpoint for ZSAD refinement under real-world production constraints, achieving improved fine-grained segmentation with only unlabeled test-time data and without sacrificing image-level anomaly detection performance (Huang et al., 27 Nov 2025).
