
ReplayCAD: Generative Replay in Vision & CAD

Updated 16 May 2026
  • ReplayCAD is a dual-use framework that generatively replays content for both continual anomaly detection and CAD program reconstruction.
  • Its anomaly detection pipeline condenses historical data into semantic tokens and spatially guided features to mitigate catastrophic forgetting and enhance segmentation.
  • The CAD pipeline converts 3D point clouds into editable, human-readable Python scripts, effectively bridging raw sensor data with procedural design.

ReplayCAD refers to two distinct technologies, each addressing challenges in a different computational domain. In the continual learning and computer vision literature, ReplayCAD denotes a generative diffusion replay framework for continual anomaly detection, which uses pre-trained latent diffusion models to replay class-conditioned image samples for robust classification and fine-grained segmentation. In the geometric reasoning and CAD (computer-aided design) community, ReplayCAD refers to a sequence-to-sequence architecture that reconstructs complete, editable CAD programs from 3D point clouds, providing a pipeline from sensor data to executable geometric scripts. Both lines of research rely on the replay of generative content, either to mitigate catastrophic forgetting in continual learning or to facilitate human-in-the-loop CAD reverse engineering.

1. Overview and Conceptual Foundations

In continual anomaly detection (abbreviated CAD in that literature, not to be confused with computer-aided design), the principal objective is to learn and preserve the ability to identify and localize anomalies across multiple classes that arrive sequentially. The two persistent challenges are catastrophic forgetting, where performance on previously seen classes degrades upon learning new ones, and the loss of pixel-level detail necessary for precise segmentation, especially of small industrial defects. Traditional continual anomaly detection methods fall into feature-replay and regularization paradigms, neither of which preserves the pixel fidelity required for robust segmentation (Hu et al., 10 May 2025).

ReplayCAD in the anomaly detection sense introduces a mechanism where historical data is condensed into a small set of generative "conditioning seeds"—combinations of semantic and spatial tokens—enabling a pre-trained latent diffusion model (LDM) to reconstruct high-fidelity historical samples with controlled spatial diversity. These generated images are replayed during anomaly-detector training, directly addressing both catastrophic forgetting and segmentation challenges.

Simultaneously, in CAD geometry, ReplayCAD refers to systems that bridge raw 3D shape representations (e.g., point clouds) and the generative replay of original design intent as executable CAD scripts. The core architectural idea is to employ an LLM with a point-cloud-to-sequence transducer that maps geometric embeddings to Python/CadQuery code, enabling "replay" of the complete procedural construction of a shape and supporting subsequent editing or analysis (Rukhovich et al., 2024).

2. Generative Diffusion Replay for Continual Anomaly Detection

ReplayCAD's anomaly detection framework harnesses a pre-trained LDM as a generative backbone, optimized in the latent space with the following DDPM loss:

$$L_\mathrm{LDM} = \mathbb{E}_{z\sim\mathcal{E}(x),\ \epsilon \sim \mathcal{N}(0,I),\ t,\ c} \left[ \|\epsilon - \epsilon_\theta(z,t,c)\|^2_2 \right]$$

Here, $\mathcal{E}$ is the image encoder, $\epsilon_\theta$ is a denoising U-Net, $t$ is the diffusion timestep, and $c$ is a composite conditioning embedding.
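As a minimal sketch of what this objective computes for a single sample, the snippet below noises a latent with the standard DDPM forward process and scores a noise predictor by squared error. The function names, the `alpha_bar` value, and the two stand-in denoisers are illustrative assumptions, not part of ReplayCAD; a real system would use a U-Net and tensors rather than Python lists. The noised latent is written `z_t` here, where the formula above writes `z`.

```python
import math
import random

def noise_latent(z0, eps, alpha_bar):
    """DDPM forward process: z_t = sqrt(alpha_bar) * z0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * z + b * e for z, e in zip(z0, eps)]

def ldm_loss(denoiser, z0, t, c, alpha_bar, rng):
    """Single-sample estimate of ||eps - eps_theta(z_t, t, c)||_2^2."""
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    z_t = noise_latent(z0, eps, alpha_bar)
    eps_hat = denoiser(z_t, t, c)
    return sum((e - h) ** 2 for e, h in zip(eps, eps_hat))

def zero_denoiser(z_t, t, c):
    """Untrained stand-in: always predicts zero noise."""
    return [0.0 for _ in z_t]

def make_oracle(z0, alpha_bar):
    """Stand-in for a perfectly trained model: inverts the forward process using z0."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return lambda z_t, t, c: [(zt - a * z) / b for zt, z in zip(z_t, z0)]

z0 = [0.5, -1.0, 2.0, 0.0]   # toy latent (hypothetical values)
alpha_bar = 0.9              # hypothetical cumulative noise-schedule value at step t

loss_untrained = ldm_loss(zero_denoiser, z0, 10, None, alpha_bar, random.Random(0))
loss_oracle = ldm_loss(make_oracle(z0, alpha_bar), z0, 10, None, alpha_bar, random.Random(0))
```

The oracle recovers the injected noise up to floating-point error, so its loss is essentially zero, while the untrained stand-in pays the full squared norm of the noise.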

Semantic Token Compression

For each class $i$, historical normals $D_\mathrm{hist}^{(i)}$ are summarized as learnable semantic tokens $v \in \mathbb{R}^{K\times C}$. The concatenated prompt encoding and tokens, $e_\mathrm{semantic} = [\mathcal{T}(p) \,|\, v]$, are optimized by minimizing the LDM loss over $D_\mathrm{hist}^{(i)}$, with the optimized tokens stored for generative replay.
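Written out, the compression step for class $i$ can be read as the optimization below. This is a sketch under two assumptions that the text does not state explicitly: the prompt encoding $\mathcal{T}(p)$ and the diffusion model stay frozen while only the tokens are updated, and $v^{*}$ is notation introduced here for the stored result.

```latex
v^{*} = \arg\min_{v \in \mathbb{R}^{K \times C}}
\mathbb{E}_{x \sim D_\mathrm{hist}^{(i)},\; z \sim \mathcal{E}(x),\; \epsilon \sim \mathcal{N}(0, I),\; t}
\left[ \left\| \epsilon - \epsilon_\theta\!\left(z, t, [\mathcal{T}(p) \,|\, v]\right) \right\|_2^2 \right]
```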

Spatial Feature Guidance

To induce spatial diversity in replayed images, ReplayCAD incorporates mask-based spatial guidance. For each historical image, a mask (computed using the Segment Anything Model, SAM) is encoded, transformed via an MLP, and concatenated to the semantic embedding, producing the composite conditioning embedding. The joint optimization of the semantic and spatial conditioning under the LDM loss enables ReplayCAD to control both semantic identity and spatial variability in replay data.
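The assembly of the conditioning input can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the mask encoder is assumed to yield a short sequence of feature vectors, the MLP is a two-layer ReLU network, concatenation is along the token axis, and all dimensions and names (`build_condition`, `MASK_DIM`, and so on) are hypothetical.

```python
import random

def rand_matrix(rows, cols, rng):
    """Random matrix as nested lists, standing in for learned weights or embeddings."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with ReLU, applied to one feature vector."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]

def build_condition(prompt_emb, semantic_tokens, mask_feats, mlp_params):
    """Concatenate prompt embeddings, semantic tokens, and MLP-projected
    mask features along the token axis."""
    spatial_tokens = [mlp(f, *mlp_params) for f in mask_feats]
    return prompt_emb + semantic_tokens + spatial_tokens

rng = random.Random(0)
C, K, MASK_DIM, HIDDEN = 4, 2, 6, 8            # hypothetical sizes
prompt_emb = rand_matrix(3, C, rng)            # 3 prompt tokens
semantic_tokens = rand_matrix(K, C, rng)       # K learnable tokens
mask_feats = rand_matrix(5, MASK_DIM, rng)     # 5 encoded mask features
mlp_params = (rand_matrix(HIDDEN, MASK_DIM, rng), [0.0] * HIDDEN,
              rand_matrix(C, HIDDEN, rng), [0.0] * C)

cond = build_condition(prompt_emb, semantic_tokens, mask_feats, mlp_params)
```

The MLP exists to bring mask features into the same channel width as the text and semantic tokens, so that the denoiser can attend over a single sequence.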

Generative Replay Pipeline

ReplayCAD operates in two stages:

  • Compression: Store the optimized semantic tokens and a set of masks for each class, optimizing each over several hundred steps with Adam.
  • Replay and Training: For each previous class, replay a set of diversified, high-resolution samples via denoising from random latents, using sampled or perturbed masks, and combine them with new-class normals for anomaly detector retraining.
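The two stages above can be sketched as a small data-flow example. Everything here is a hypothetical stand-in: `ClassMemory` is an assumed record layout, `fake_generate` replaces diffusion sampling with a tagged tuple, and the class names and sample counts are made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ClassMemory:
    """Compressed record kept for one previously seen class (hypothetical layout)."""
    name: str
    semantic_tokens: list
    masks: list = field(default_factory=list)

def replay_class(memory, generate, n_samples):
    """Generate n_samples images for one stored class, cycling through its masks."""
    return [generate(memory.semantic_tokens, memory.masks[i % len(memory.masks)])
            for i in range(n_samples)]

def build_training_set(memories, new_normals, generate, n_samples):
    """Union of replayed samples for old classes and real normals of the new class."""
    replayed = [img for m in memories
                for img in replay_class(m, generate, n_samples)]
    return replayed + list(new_normals)

def fake_generate(tokens, mask):
    # Stand-in for denoising from a random latent; returns a tag, not an image.
    return ("replayed", tuple(tokens), mask)

memories = [ClassMemory("bottle", [0.1, 0.2], ["m0", "m1"]),
            ClassMemory("cable", [0.3, 0.4], ["m2"])]
train = build_training_set(memories, ["new_0", "new_1", "new_2"],
                           fake_generate, n_samples=4)
```

Only the per-class tokens and masks persist between tasks; the replayed images are regenerated whenever a new class arrives.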

The anomaly detector (e.g., InvAD) is trained on this union, supervised solely by a reconstruction loss on feature representations, without explicit segmentation supervision.

3. Reverse Engineering and Replay of CAD Programs

In computational geometry and CAD, ReplayCAD describes a system that reconstructs full, human-readable CAD programs from 3D data. The task is formalized as a sequence transduction: mapping a point cloud (point positions and normals) to a valid Python script representing a sequence of sketch and extrude operations in CadQuery.

CAD Sequence Representation

The output is an executable Python program built from CadQuery primitives such as .sketch(), .extrude(), .union(), and the box and cylinder functions. Because it is ordinary Python, the program can reuse parameters, use loops, or be decomposed into functions.
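To give a feel for this representation, the sketch below renders a two-step sketch-and-extrude sequence as CadQuery source text. It is an illustrative script, not output from the model: the helper `emit_cadquery` and the dimensions are hypothetical. The emitted code uses the CadQuery `Workplane` methods `rect`, `circle`, `extrude`, `faces`, and `workplane`. The snippet only checks that the generated text is valid Python; running the script itself requires CadQuery to be installed.

```python
def emit_cadquery(base, boss):
    """Render a two-step sketch-and-extrude sequence as CadQuery source text.

    base: (width, depth, height) of a rectangular pad.
    boss: (radius, height) of a cylinder extruded from the pad's top face.
    """
    w, d, h = base
    r, boss_h = boss
    return "\n".join([
        "import cadquery as cq",
        "",
        f"w, d, h = {w}, {d}, {h}",
        f"r, boss_h = {r}, {boss_h}",
        "",
        "result = (",
        '    cq.Workplane("XY")',
        "    .rect(w, d).extrude(h)",
        '    .faces(">Z").workplane()',
        "    .circle(r).extrude(boss_h)",
        ")",
    ])

script = emit_cadquery(base=(40.0, 20.0, 5.0), boss=(6.0, 10.0))
code_obj = compile(script, "<generated>", "exec")  # syntax check only, nothing is executed
print(script)
```

Naming the dimensions at the top of the script is what makes later edits cheap: changing `r` or `h` and re-running the script regenerates the solid.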

Network Architecture

A lightweight point-cloud projector transforms sampled points into “query tokens” using positional encodings and linear projection. These tokens are appended to token embeddings of the partially generated script and fed to an LLM, e.g., Qwen2-1.5B, to autoregressively predict the next Python code token.
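A minimal sketch of such a projector follows, assuming a sin/cos positional encoding at octave-spaced frequencies followed by a single linear layer. The frequency count, the model width, the use of positions only (no normals), and the choice to place the query tokens before the code-token embeddings are all assumptions made for illustration.

```python
import math
import random

def fourier_encode(point, n_freqs):
    """Sin/cos positional encoding of one 3D point at octave-spaced frequencies."""
    feats = []
    for coord in point:
        for k in range(n_freqs):
            feats.append(math.sin((2 ** k) * math.pi * coord))
            feats.append(math.cos((2 ** k) * math.pi * coord))
    return feats

def project(points, weight, bias, n_freqs):
    """Map each sampled point to one query token: linear(fourier_encode(point))."""
    tokens = []
    for p in points:
        f = fourier_encode(p, n_freqs)
        tokens.append([sum(w * x for w, x in zip(row, f)) + b
                       for row, b in zip(weight, bias)])
    return tokens

rng = random.Random(0)
n_freqs, d_model = 4, 8                 # hypothetical sizes
in_dim = 3 * 2 * n_freqs                # 3 coordinates, sin and cos per frequency
weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(d_model)]
bias = [0.0] * d_model

points = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(16)]
query_tokens = project(points, weight, bias, n_freqs)

# Placeholder embeddings for the code tokens generated so far.
code_embeddings = [[0.0] * d_model for _ in range(5)]
llm_input = query_tokens + code_embeddings   # sequence handed to the language model
```

Because the projector emits vectors of the same width as the language model's token embeddings, the LLM itself needs no architectural change to consume geometry.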

Training

Training uses a synthetic corpus of valid CadQuery scripts and their corresponding point clouds. The model minimizes the standard negative log-likelihood objective on code tokens, optimized with AdamW under cosine learning rate decay, and is trained for 100k iterations on H100 GPUs.
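The two ingredients named here, a token-level negative log-likelihood and a cosine learning-rate decay, can be sketched in a few lines. The base learning rate below is a hypothetical placeholder rather than the value used in training, and decay to zero without warmup is likewise an assumption.

```python
import math

def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

def token_nll(token_probs):
    """Mean negative log-likelihood, given the probability the model
    assigned to each reference code token."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

BASE_LR = 1e-4          # hypothetical value for illustration
TOTAL_STEPS = 100_000   # iteration count stated in the text

lr_start = cosine_lr(0, TOTAL_STEPS, BASE_LR)
lr_mid = cosine_lr(TOTAL_STEPS // 2, TOTAL_STEPS, BASE_LR)
lr_end = cosine_lr(TOTAL_STEPS, TOTAL_STEPS, BASE_LR)
nll_uniform_guess = token_nll([0.5, 0.5, 0.5])   # equals ln 2
```

The schedule starts at the base rate, passes through half of it at the midpoint, and reaches the minimum at the final iteration.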

4. Empirical Evaluation and Results

Continual Anomaly Detection

Evaluation on the VisA (12 classes, 8659/2162 train/test) and MVTec (15 classes, 3629/1725 train/test) benchmarks uses image-level AUROC, pixel-level AP (area under the precision-recall curve), and the forgetting measure (FM). ReplayCAD achieves state-of-the-art segmentation and classification performance:

Dataset   Baseline (UCAD)   ReplayCAD   Gain (pixel-AP)
VisA      30.0              41.5        +11.5
MVTec     45.6              53.7        +8.1

ReplayCAD also demonstrates strong resistance to forgetting (FM ≈ 5 pp), and ablations show the integration of semantic and spatial conditioning is required for peak performance (Hu et al., 10 May 2025).
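For readers unfamiliar with the forgetting measure, the sketch below uses a common definition from the continual-learning literature: for each earlier task, the best score obtained before the final stage minus the score after the final stage, averaged over tasks. The paper's exact formula may differ, and the scores in the example are toy values, not reported results.

```python
def forgetting_measure(acc):
    """acc[t][j] is the score on task j after training through task t (j <= t).

    Returns the mean, over every task except the last, of the best earlier
    score minus the score after the final training stage.
    """
    n_tasks = len(acc)
    if n_tasks < 2:
        return 0.0
    drops = []
    for j in range(n_tasks - 1):
        best_past = max(acc[t][j] for t in range(j, n_tasks - 1))
        drops.append(best_past - acc[n_tasks - 1][j])
    return sum(drops) / len(drops)

# Toy scores for three sequential tasks (illustrative only).
toy_scores = [
    [90.0],
    [85.0, 80.0],
    [82.0, 78.0, 88.0],
]
fm = forgetting_measure(toy_scores)   # (90 - 82 + 80 - 78) / 2 = 5.0
```

A lower value means earlier classes are better retained after later training stages.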

CAD Program Recovery

On the DeepCAD, Fusion360, and CC3D benchmarks, ReplayCAD attains higher geometric reconstruction accuracy than previous methods (median Chamfer Distance 0.168 in the scaled units of the table below, IoU 87.6%, Invalidity Ratio 0.5%):

Method       Mean CD↓   Median CD↓   IoU (%)↑   Invalidity (%)↓
DeepCAD      42.5       9.64         46.7       7.1
TransCAD     32.3       4.51         65.5       1.1
CAD-SIGNet   3.43       0.283        77.6       0.9
ReplayCAD    0.308      0.168        87.6       0.5

On real noisy scans (CC3D), ReplayCAD halves the Chamfer distance and doubles IoU versus CAD-SIGNet (Rukhovich et al., 2024).
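The two geometric metrics can be sketched on toy inputs as follows. Conventions vary between papers (squared versus unsquared distances, sums versus means, extra scaling factors), and the benchmark's exact variant is not restated here, so this shows one common form: a symmetric Chamfer distance built from mean squared nearest-neighbour distances, and IoU over occupied voxel indices.

```python
def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point sets: mean squared
    nearest-neighbour distance from a to b plus the same from b to a."""
    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    def one_way(src, dst):
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)

def voxel_iou(vox_a, vox_b):
    """Intersection over union of two sets of occupied voxel indices."""
    a, b = set(vox_a), set(vox_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

pts_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pts_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
cd = chamfer_distance(pts_a, pts_b)   # 0.5 + 0.5 = 1.0

iou = voxel_iou({(0, 0, 0), (1, 0, 0)}, {(1, 0, 0), (2, 0, 0)})   # 1 shared of 3 total
```

This brute-force version is quadratic in the number of points; evaluation code for large point sets typically uses a spatial index instead.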

5. Interpretability and Downstream Applications

Continual Anomaly Detection

The generative approach enables replay of pixel-accurate normal images, improving anomaly segmentation in challenging industrial use cases. A small storage cost (≈3 MB per dataset) yields substantial performance improvements and mitigates catastrophic forgetting, a key impediment in class-incremental workflows.

Editable CAD Reconstructions

The generated CAD scripts are human-readable, immediately supporting LLM-based CAD question answering (e.g., geometric queries) and interactive editing (e.g., refactoring code to expose design parameters tied to GUI sliders). For instance, post-generation modification via LLMs enables rapid parameterization workflows not available with vectorized or opaque format reconstructions.

A plausible implication is that ReplayCAD furnishes a direct bridge from unsupervised sensor data to end-to-end, editable CAD models—preserving not just geometry but procedural intent and affordances for interactive design.

6. Limitations and Future Prospects

In continual anomaly detection:

  • Domain shift between pre-trained LDMs and specific textures of industrial vision datasets remains an open issue. Extending to domain-adaptive priors or fine-tuning on limited in-domain samples is a candidate direction.
  • Reliance on mask extraction tools (e.g., SAM) introduces a dependency on mask quality; although ablations indicate robustness, a mask-free spatial conditioning approach may improve generality.
  • Diffusion sampling incurs non-trivial computational expense relative to feature replay; accelerations such as DDIM or latent-step reduction are under consideration.
  • Current implementation replays only "normal" images; extending to synthetic defect generation would enable semi-supervised continual anomaly segmentation (Hu et al., 10 May 2025).
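To illustrate where the sampling savings mentioned above would come from, the sketch below selects an evenly strided subset of training timesteps, which is the kind of shortened schedule that DDIM-style samplers run. It covers schedule selection only, not the DDIM update rule, and the step counts are hypothetical rather than ReplayCAD settings.

```python
def strided_timesteps(num_train_steps, num_sample_steps):
    """Evenly strided subset of training timesteps, in descending order,
    as visited by a shortened sampling schedule."""
    if not 1 <= num_sample_steps <= num_train_steps:
        raise ValueError("num_sample_steps must be between 1 and num_train_steps")
    stride = num_train_steps // num_sample_steps
    steps = list(range(0, num_train_steps, stride))[:num_sample_steps]
    return steps[::-1]

schedule = strided_timesteps(1000, 50)   # 50 denoiser calls instead of 1000
short = strided_timesteps(10, 3)         # [6, 3, 0]
```

Each replayed image then costs one denoiser evaluation per retained timestep, so the cost of replay scales with the length of this list.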

In geometry-to-CAD replay:

  • Real-world applicability hinges on the expressivity of the DSL (CadQuery) and the fidelity of the procedural dataset.
  • Handling of intricate or heavily feature-based CAD sequences may require finer-grained primitives, hierarchical decoding, or integration of constraint solvers.

Across both domains, ReplayCAD applies generative replay to knowledge retention, fine-grained modeling, and human-in-the-loop editability, with applications in continual learning for vision and prompt-based CAD design workflows (Hu et al., 10 May 2025, Rukhovich et al., 2024).
