- The paper introduces a feature scattering-based adversarial training method that generates adversarial examples in an unsupervised, collaborative fashion, thereby avoiding label leaking.
- The paper leverages optimal transport to maximize the feature matching distance between clean and perturbed samples within a bilevel optimization framework.
- Experimental results on CIFAR10, CIFAR100, and SVHN show significant improvements in adversarial robustness; on CIFAR10, accuracy under PGD attack improves by 25.6 percentage points over the Madry et al. baseline.
Defense Against Adversarial Attacks Using Feature Scattering-based Adversarial Training
This paper presents an innovative feature scattering-based adversarial training method to enhance model robustness against adversarial attacks. Traditional adversarial training methods utilize a supervised scheme for generating adversarial samples, often encountering issues like label leaking. The proposed approach distinguishes itself by adopting an unsupervised methodology to generate adversarial images via feature scattering in the latent space, effectively circumventing the challenge of label leaking. Moreover, this approach emphasizes collaborative perturbation generation by considering inter-sample relationships, as opposed to treating each sample in isolation.
Main Contributions
The paper makes several contributions to improve adversarial training:
- Novel Approach: It introduces a feature-scattering technique for creating adversarial images in an unsupervised, collaborative fashion. This method diverges from the traditional minimax formulation common in adversarial training.
- Bilevel Optimization: The proposed training objective is cast as an instance of a more general bilevel optimization problem, in which an inner problem generates the perturbations and an outer problem updates the model parameters (a sketch of the formulation follows this list).
- Robustness Analysis: Through extensive experimentation on various datasets, the paper analyzes the effectiveness of feature scattering in comparison to state-of-the-art adversarial training techniques.
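As a rough sketch (the symbols below are illustrative notation, not a verbatim reproduction of the paper's equations), the training objective can be written as a bilevel problem in which the outer level minimizes the supervised loss on perturbed inputs and the inner level maximizes an unsupervised OT feature-matching distance:

```latex
% Illustrative notation (not copied from the paper): f_\theta is the classifier,
% g_\theta the feature extractor, \delta_z a Dirac mass at z, \mathcal{D} the OT
% feature-matching distance, and \epsilon the L_\infty perturbation budget.
\[
\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} \mathcal{L}\bigl( f_{\theta}(x_i'), \, y_i \bigr)
\quad \text{s.t.} \quad
\{x_i'\} = \arg\max_{\|x_i' - x_i\|_{\infty} \le \epsilon}
\mathcal{D}\!\left( \frac{1}{n} \sum_{i} \delta_{g_{\theta}(x_i)}, \;
                    \frac{1}{n} \sum_{i} \delta_{g_{\theta}(x_i')} \right)
\]
```

Unlike the standard minimax formulation, the inner problem does not use the labels: it only maximizes the feature-matching distance over the whole batch, which is what avoids label leaking.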
Methodology
The feature scattering method maximizes the feature matching distance between the empirical distributions of clean and perturbed samples. The optimal transport (OT) distance serves as this matching distance, with the ground cost defined on features extracted by the model. Because the perturbations are generated jointly for a batch, the technique preserves inter-sample structure while producing adversarial perturbations, avoiding the pitfalls of label-guided adversarial examples that may drift off the data manifold.
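To make the inner maximization concrete, below is a minimal PyTorch-style sketch, not the authors' released implementation. It assumes a hypothetical `model.features` feature extractor, approximates the OT distance with entropy-regularized Sinkhorn iterations, and uses illustrative hyperparameters.

```python
# A minimal sketch of feature-scattering perturbation generation.
# Assumptions: hypothetical `model.features` attribute; Sinkhorn approximation
# of the OT distance; epsilon/step size/regularization values are illustrative.
import torch
import torch.nn.functional as F


def sinkhorn_ot_distance(feat_clean, feat_adv, eps_reg=0.1, n_iters=30):
    """Entropy-regularized OT distance between two batches of features.

    Ground cost is (1 - cosine similarity) between clean and perturbed
    features; both marginals are uniform over the batch.
    """
    a = F.normalize(feat_clean, dim=1)
    b = F.normalize(feat_adv, dim=1)
    cost = 1.0 - a @ b.t()                       # (n, n) ground cost matrix
    n = cost.size(0)
    mu = torch.full((n,), 1.0 / n, device=cost.device)
    nu = torch.full((n,), 1.0 / n, device=cost.device)
    K = torch.exp(-cost / eps_reg)               # Gibbs kernel
    u = torch.ones_like(mu)
    for _ in range(n_iters):                     # Sinkhorn fixed-point updates
        v = nu / (K.t() @ u + 1e-9)
        u = mu / (K @ v + 1e-9)
    transport = u.unsqueeze(1) * K * v.unsqueeze(0)
    return torch.sum(transport * cost)           # <T, C>: feature-matching distance


def feature_scattering_perturb(model, x, epsilon=8 / 255, step_size=8 / 255, n_steps=1):
    """Generate perturbations by ascending the OT feature-matching distance.

    No labels are used (the unsupervised aspect that avoids label leaking),
    and the whole batch is perturbed jointly through the OT coupling
    (the collaborative aspect).
    """
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0.0, 1.0)
    with torch.no_grad():
        feat_clean = model.features(x)           # fixed clean reference features
    for _ in range(n_steps):
        x_adv.requires_grad_(True)
        loss = sinkhorn_ot_distance(feat_clean, model.features(x_adv))
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()                        # ascent step
            x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)  # project to eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
```

The outer step then minimizes the usual supervised loss on the perturbed batch with the true labels, e.g. `F.cross_entropy(model(x_adv), y)`, completing the bilevel scheme sketched above.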
Experimental Results
The efficacy of the proposed approach is validated on benchmark datasets such as CIFAR10, CIFAR100, and SVHN:
- On CIFAR10, the proposed method achieves 70.5% accuracy under a standard 20-step PGD attack, outperforming prior methods by notable margins (e.g., 25.6 percentage points over the Madry method); a sketch of this evaluation protocol appears after the list.
- Experiments on CIFAR100 and SVHN further corroborate the robustness of the proposed approach, showing substantial improvements in adversarial accuracy under white-box attacks compared to existing models.
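For context, the robust accuracy numbers above come from white-box attacks such as PGD-20. A minimal sketch of that evaluation protocol, under the assumptions noted in the comments, is:

```python
# A minimal sketch of a 20-step PGD robustness evaluation.
# Assumptions: `model` maps images in [0, 1] to logits; epsilon = 8/255 and
# step size = 2/255 are common CIFAR10 conventions, not values stated here.
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, epsilon=8 / 255, step_size=2 / 255, n_steps=20):
    """Untargeted L-infinity PGD: iterated signed-gradient ascent on the loss."""
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0.0, 1.0)
    for _ in range(n_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()                        # ascent step
            x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)  # project to eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()


def robust_accuracy(model, loader):
    """Fraction of samples still classified correctly under PGD-20."""
    device = next(model.parameters()).device
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y)
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.size(0)
    return correct / total
```

Random initialization within the epsilon-ball and per-step projection back into it are the standard PGD ingredients; adding restarts or more steps would make the evaluation more conservative.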
Implications and Future Directions
The implications of this research are twofold. Practically, the feature scattering method enables training models that are inherently more robust to adversarial attacks without incurring the time and computational penalties associated with traditional adversarial training iterations. Theoretically, it opens avenues for leveraging inter-sample structure more effectively, encouraging exploration of collaborative perturbation techniques across machine learning domains.
Future research can further refine this unsupervised adversarial sample generation approach, potentially integrating other structural learning paradigms and exploring its applications in various domains beyond image classification, such as object detection and natural language processing. Additionally, investigating the theoretical bounds of adversarial robustness achievable through such collaborative methods can yield deeper insights into the limitations and capabilities of current adversarial defense strategies.