- The paper demonstrates that imperceptible adversarial perturbations, generated via PGD and frequency-domain methods, significantly degrade 3D reconstruction quality.
- It introduces both white-box attacks using gradient-based PGD and black-box attacks leveraging NES and CMA-ES in the low-frequency DCT domain.
- Empirical results reveal drastic drops in PSNR and SSIM metrics, exposing vulnerabilities in applications like robotics, autonomous driving, and content creation.
AdvSplat: Adversarial Attacks on Feed-Forward Gaussian Splatting Models
Introduction
Feed-forward 3D Gaussian Splatting (3DGS) models have advanced scalable, real-time, high-fidelity 3D reconstruction by transitioning from per-scene optimization to models that generalize across scenes via large-scale pretraining. As applications proliferate in robotics, autonomous driving, and content generation, these models are likely to be deployed widely in safety-critical and commercial environments. Despite their advantages, their neural architectures raise vulnerability concerns regarding adversarial manipulation, a subject previously unaddressed for feed-forward 3DGS. The "AdvSplat" paper systematically analyzes the adversarial robustness of feed-forward 3DGS, introducing attack algorithms targeting these models in both white-box and black-box settings (2603.23686).
Methods: Adversarial Attacks on Feed-Forward 3DGS
White-Box Attacks
The authors first show that a standard white-box attack, Projected Gradient Descent (PGD), can be applied directly to feed-forward 3DGS models when model gradients are accessible. By maximizing a combination of MSE and LPIPS losses between rendered and reference images under an imperceptible ℓ∞-bounded perturbation, the attack severely degrades novel-view synthesis quality. Notably, perturbations that are nearly invisible to the human eye are sufficient to catastrophically disrupt 3D reconstruction in every scenario tested.
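The PGD loop described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the fixed linear map `W` is a hypothetical stand-in for the full pipeline (encode views, predict Gaussians, render), only the MSE term is used (the LPIPS term is omitted), and the step size, iteration count, and random start are assumed values.

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=8 / 255, step=2 / 255, iters=10, seed=0):
    """l_inf PGD that maximizes a loss whose input gradient is grad_fn."""
    rng = np.random.default_rng(seed)
    # Random start inside the eps-ball. In this toy the reference equals the
    # clean output, so the gradient is exactly zero at the clean input.
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
    for _ in range(iters):
        x_adv = x_adv + step * np.sign(grad_fn(x_adv))  # signed ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)        # project onto the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                # keep valid pixel values
    return x_adv

# Hypothetical stand-in for the model: a fixed linear map, chosen so that
# the MSE gradient can be written by hand instead of using autodiff.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 16)) / 4.0
x_clean = rng.uniform(0.2, 0.8, size=16)
reference = W @ x_clean

def mse_loss(x):
    return float(np.mean((W @ x - reference) ** 2))

def mse_grad(x):
    return 2.0 * W.T @ (W @ x - reference) / reference.size

x_adv = pgd_attack(x_clean, mse_grad)
print(mse_loss(x_clean), mse_loss(x_adv))
```

In a real white-box setting the hand-written gradient would be replaced by automatic differentiation through the model and renderer. The two clipping steps, which keep the perturbation within ϵ and the image within the valid pixel range, would stay the same.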
Figure 1: White-box attack results: even imperceptible input perturbations imposed via PGD can cause massive degradation in novel viewpoint renderings for feed-forward 3DGS models.
Black-Box Attacks: Frequency-Domain Optimization
Recognizing that white-box access is unrealistic in practice, the study focuses on black-box attacks with practical API-query settings. Classic transfer-based or random search attacks prove ineffective due to distinct model backbones and data distributions among large pre-trained feed-forward 3DGS models. Instead, AdvSplat introduces two query-efficient black-box attacks leveraging frequency-domain parameterization of perturbations:
- Gradient-Based (NES) Approach: Estimates the gradient through Natural Evolution Strategies by sampling low-frequency DCT components, optimizing perturbations in the DCT domain, which are subsequently applied in pixel space.
- Gradient-Free (CMA-ES) Approach: Employs a modified Covariance Matrix Adaptation Evolution Strategy, iteratively sampling and selecting perturbations in the low-frequency DCT space to maximize the adversarial loss.
Both attacks manipulate only the low-frequency DCT coefficients of local image blocks, which the authors show empirically to carry the information essential for a successful attack. Restricting the search to these coefficients sharply reduces both the dimensionality of the optimization and the number of queries required.
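To make the frequency-domain parameterization concrete, the sketch below optimizes a small grid of low-frequency block-DCT coefficients with an antithetic NES gradient estimate and maps them back to pixel space before each query. It is an illustration under stated assumptions rather than the paper's code: the block size (8), the number of retained coefficients (2×2), the NES hyperparameters, the random start, and the `toy_render` function standing in for the queried model are all hypothetical, and the image sides must be divisible by the block size.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II matrix; row k is the k-th frequency."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0] = np.sqrt(1.0 / n)
    return d

def coeffs_to_pixels(z, block):
    """Inverse block DCT of low-frequency coefficients z: (hb, wb, keep, keep)."""
    hb, wb, keep, _ = z.shape
    d = dct_basis(block)[:keep]                       # lowest `keep` frequencies
    patches = np.einsum("ka,ijkl,lb->ijab", d, z, d)  # d.T @ z_block @ d per block
    return patches.transpose(0, 2, 1, 3).reshape(hb * block, wb * block)

def nes_dct_attack(x, loss_fn, eps=8 / 255, block=8, keep=2,
                   pairs=10, sigma=0.1, lr=0.05, iters=30, seed=0):
    """Query-only attack: NES gradient estimate over low-frequency DCT coefficients."""
    rng = np.random.default_rng(seed)
    shape = (x.shape[0] // block, x.shape[1] // block, keep, keep)

    def apply(z):
        delta = np.clip(coeffs_to_pixels(z, block), -eps, eps)  # l_inf budget
        return np.clip(x + delta, 0.0, 1.0)

    z = rng.uniform(-eps, eps, size=shape)  # small random start (assumed)
    for _ in range(iters):
        grad = np.zeros(shape)
        for _ in range(pairs):
            u = rng.normal(size=shape)
            # Antithetic pair: two loss queries per sampled direction.
            diff = loss_fn(apply(z + sigma * u)) - loss_fn(apply(z - sigma * u))
            grad += diff * u
        grad /= 2 * pairs * sigma
        z = z + lr * np.sign(grad)  # signed ascent in coefficient space
    return apply(z)

def toy_render(img):
    """Hypothetical black box: 2x2 average pooling instead of a 3DGS model."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

x_clean = np.random.default_rng(1).uniform(0.2, 0.8, size=(32, 32))
loss = lambda img: float(np.mean((toy_render(img) - toy_render(x_clean)) ** 2))
x_adv = nes_dct_attack(x_clean, loss)
print(loss(x_clean), loss(x_adv))
```

With these assumed settings, the 32×32 image is controlled by 4×4×2×2 = 64 coefficients instead of 1,024 pixels, and each iteration costs 20 loss queries.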
Figure 2: Pipeline of AdvSplat, including gradient-based and gradient-free black-box attacks operating in the DCT frequency domain.
Empirical Results
Experiments span two major datasets (RE10K and DL3DV) and three state-of-the-art feed-forward 3DGS models: the pose-known DepthSplat and the pose-free NoPoSplat and AnySplat. AdvSplat consistently yields large reductions in reconstruction quality metrics (PSNR, SSIM, CLIP/DINO similarity), while LPIPS rises significantly, indicating perceptual distortion. Under a perturbation budget of ϵ=8/255, PSNR drops by up to 64% and SSIM by up to 59%, with LPIPS increasing by more than 150% for DepthSplat (RE10K). These trends hold across all models and datasets, regardless of attack variant.
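For readers unfamiliar with how such percentages are obtained, the following sketch computes PSNR and the relative change of a metric between a clean and an attacked rendering. The images here are synthetic noise, generated only to exercise the functions. They are not data from the paper, and the helper names are illustrative.

```python
import numpy as np

def psnr(img, ref, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((np.asarray(img, dtype=float) - np.asarray(ref, dtype=float)) ** 2)
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))

def relative_change(clean_score, attacked_score):
    """Signed fractional change of a metric; -0.5 means the score halved."""
    return (attacked_score - clean_score) / clean_score

# Synthetic example (made-up data): a reference view, a close rendering,
# and a heavily corrupted rendering.
rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 1.0, size=(64, 64, 3))
clean_render = np.clip(reference + rng.normal(scale=0.02, size=reference.shape), 0, 1)
attacked_render = np.clip(reference + rng.normal(scale=0.20, size=reference.shape), 0, 1)

clean_psnr = psnr(clean_render, reference)
attacked_psnr = psnr(attacked_render, reference)
print(f"PSNR change: {relative_change(clean_psnr, attacked_psnr):+.0%}")
```

Because PSNR is logarithmic, a percentage drop in dB corresponds to a much larger multiplicative increase in mean squared error.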
Figure 3: Qualitative results on RE10K and DL3DV (ϵ = 8/255), revealing that adversarial attacks create substantial artifacts and color shifts in the renderings, while clean inputs remain visually plausible.
Black-box attacks are, in many cases, nearly as effective as white-box attacks, and which black-box variant performs better depends on the model. Notably, for models where gradient estimation is poor (e.g., NoPoSplat, a high-capacity model trained on mixed data), the gradient-free approach outperforms the gradient-based one.
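The gradient-free idea of sampling candidates, keeping the best, and resampling around them can be illustrated with a deliberately simplified evolution strategy. The sketch below is not CMA-ES and not the paper's modified variant. It is a cross-entropy-style strategy with a diagonal Gaussian and no covariance or step-size adaptation. The objective `toy_distortion`, the parameter bound, and all hyperparameters are assumptions. In the AdvSplat setting the parameter vector would hold low-frequency DCT coefficients and the objective would be the queried adversarial loss.

```python
import numpy as np

def simple_es_maximize(f, dim, pop=16, elite=4, sigma0=0.3,
                       iters=40, bound=1.0, seed=0):
    """Sample, rank, and refit a diagonal Gaussian to the top candidates."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    std = np.full(dim, sigma0)
    best_z, best_f = mean.copy(), f(mean)
    for _ in range(iters):
        # Draw a population and keep every candidate inside the allowed box.
        samples = np.clip(mean + std * rng.normal(size=(pop, dim)), -bound, bound)
        scores = np.array([f(s) for s in samples])  # one query per candidate
        top = samples[np.argsort(scores)[-elite:]]   # highest-scoring candidates
        mean = top.mean(axis=0)
        std = top.std(axis=0) + 1e-3                 # floor avoids collapse to zero
        if scores.max() > best_f:
            best_f = float(scores.max())
            best_z = samples[scores.argmax()].copy()
    return best_z, best_f

# Hypothetical objective: distortion grows with distance from a clean point.
target = np.linspace(-0.5, 0.5, 8)

def toy_distortion(z):
    return float(np.sum((z - target) ** 2))

best_z, best_f = simple_es_maximize(toy_distortion, dim=8)
print(best_f)
```

Since the update uses only the ranking of candidates, it does not depend on the accuracy of a gradient estimate, which is one plausible reason a gradient-free method can do better when estimated gradients are poor.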
Frequency-Domain Parameterization Efficiency
The frequency-domain (DCT) approach accelerates attack convergence and achieves higher loss for a given query budget compared to pixel-space optimization, as shown in convergence plots. This effect is attributed to the reduction of the search space and the structural alignment of low-frequency perturbations with the features leveraged by feed-forward 3DGS models.
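A back-of-the-envelope calculation shows how much the search space can shrink. The image size, block size, and number of retained coefficients below are assumed for illustration. The summary does not state the values used in the paper.

```python
def search_dims(height, width, channels, block, keep):
    """Return (pixel-space dims, low-frequency block-DCT dims)."""
    pixel_dims = height * width * channels
    dct_dims = (height // block) * (width // block) * channels * keep * keep
    return pixel_dims, dct_dims

# Assumed setting: one 256x256 RGB view, 8x8 blocks, 2x2 coefficients kept.
pixel_dims, dct_dims = search_dims(256, 256, 3, block=8, keep=2)
print(pixel_dims, dct_dims, pixel_dims / dct_dims)  # 196608 12288 16.0
```

Under these assumptions the optimizer searches over 16 times fewer variables per input view, so each query informs a larger share of the remaining variables.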
Figure 4: Loss curves comparing DCT-based and direct pixel-space approaches, showing superior attack efficiency when optimizations are constrained to the low-frequency domain.
Attack Strength Analysis
Increasing the ℓ∞ perturbation budget degrades rendering quality monotonically, both quantitatively and qualitatively. Under strong attacks (ϵ=16/255), renderings collapse entirely, indicating that the current model class lacks inherent robustness.
Figure 5: Qualitative outcomes at increasing attack strengths; catastrophic failure becomes visible at higher ϵ values.
Structural Effects
Visualization of the underlying 3D Gaussian primitives confirms that imperceptible input perturbations propagate through the network to substantially alter the parameters (color, opacity) of the rendered 3D structure, reinforcing the depth of the vulnerability.
Figure 6: Visualization of 3D Gaussian point clouds illustrating profound color and opacity deviations correlating with adversarial input perturbations.
Implications and Future Directions
AdvSplat challenges the assumption that feed-forward 3DGS models are robust by design, exposing them to imperceptible adversarial manipulations capable of destroying 3D reconstruction fidelity. The demonstrated black-box effectiveness raises immediate concerns for any application pipeline dependent on third-party input, including web-upload scenarios, simulation-to-real transfer in robotics, and visual mapping in autonomous systems.
Theoretical Implications:
- The results mark a critical security gap: current approaches to large-scale 3D generalization are susceptible to the same adversarial phenomena observed in 2D vision, even in the more complex novel-view synthesis setting.
- The non-transferability of adversarial perturbations between different 3DGS models suggests that architectural diversity or ensemble strategies may offer some protection against transfer-based attacks, but query-based black-box strategies that require no gradient access can still succeed.
Practical Implications:
- Attack queries can be instantiated via simple API calls with no access to model internals, emphasizing the need for robust defense mechanisms wherever inference APIs are exposed.
- Since very low-budget perturbations are effective, relying on trivial input filtering or anomaly detection at the pixel level is insufficient.
Future Directions:
- Research into robustifying 3DGS architectures through adversarial training, architectural regularization, or certified defenses analogous to progress in robust 2D vision is essential.
- The frequency-domain vulnerability of feed-forward 3DGS may motivate hybrid approaches that combine spectral regularization with adversarial defense.
- Understanding the transfer properties (or lack thereof) at the feature and geometry levels can inform models designed for both robustness and generalization.
Conclusion
AdvSplat introduces and systematically characterizes the adversarial vulnerability of feed-forward 3D Gaussian Splatting models in the context of generalizable 3D reconstruction. With efficient, query-restricted black-box attacks in the frequency domain, the study demonstrates that state-of-the-art models remain acutely susceptible to subtle, structured adversarial perturbations with devastating effects on 3D outputs. The findings underscore the pressing need for more robust and security-aware 3D representation learning, and open several new research directions for both theoretical and applied machine learning communities.