- The paper introduces various adversarial attack strategies that manipulate graph structures to deceive GNN classifiers.
- It employs reinforcement learning, gradient-based optimization, and genetic algorithms to tailor attacks for both black-box and white-box settings.
- Experiments reveal significant drops in model accuracy, underscoring the need for robust defense mechanisms in critical applications.
Adversarial Attack on Graph Structured Data
Overview
The paper "Adversarial Attack on Graph Structured Data" by Hanjun Dai et al. focuses on the vulnerability of Graph Neural Networks (GNNs) to adversarial attacks. While deep learning models in the domains of image and text recognition have been extensively studied and fortified against adversarial attacks, this investigation targets the largely unexplored area of graph-structured data. The paper proposes several methods to conduct adversarial attacks, evaluates their effectiveness, and discusses potential defenses.
Methods
The paper introduces multiple attack strategies tailored to different levels of access to the target model, each capable of manipulating the graph structure to deceive the GNN classifier. These methods are summarized below:
- Reinforcement Learning-Based Attack (RL-S2V):
- Utilizes reinforcement learning to devise a policy for modifying the graph structure based solely on prediction labels from the target classifier.
- Applicable in the practical black-box setting, where only the discrete prediction labels returned by the target classifier are observable; no gradients or confidence scores are required.
- A hierarchical Q-function decomposes the quadratic edge-selection action space into two sequential node choices, making Q-learning feasible on large graphs (a conceptual sketch of this decomposition appears after this list).
- Gradient-Based Attack (GradArgmax):
- Applicable when gradients from the target classifier are accessible.
- Computes the gradient of the classification loss with respect to adjacency-matrix entries and greedily flips the edges whose gradients most favor misclassification (see the gradient-based sketch after this list).
- Suitable for white-box scenarios where comprehensive model information is available.
- Genetic Algorithm-Based Attack (GeneticAlg):
- Maintains a population of candidate edge modifications and improves it over generations through selection, crossover, and mutation.
- Fitness is measured with the target model's loss or prediction confidence, so no gradient access is needed (a sketch follows the list).
- Suited to practical black-box scenarios where prediction confidence scores are available.
- Random Sampling (RandSampling):
- The simplest method: randomly adds or deletes edges, subject to the constraint that the modified graph remains semantically equivalent to the original (e.g., a small edge-modification budget); a short sketch follows the list.
- Requires minimal information from the target model and serves as a baseline for more sophisticated attacks.
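The hierarchical decomposition behind RL-S2V can be illustrated with a short sketch: rather than scoring all O(|V|^2) candidate edge modifications at once, the agent picks one endpoint and then the other, so each decision only ranks O(|V|) actions. The code below is a conceptual sketch rather than the authors' implementation; `q_node` is a hypothetical stand-in for the learned Q-function, which in the paper is parameterized with structure2vec graph embeddings.

```python
import random


def q_node(edges, node, chosen_first=None):
    """Hypothetical stand-in for the learned Q-value of picking `node`.

    In RL-S2V this score would come from structure2vec node embeddings;
    here it is random, purely to illustrate the control flow.
    """
    return random.random()


def pick_edge_hierarchically(nodes, edges):
    """Select one edge modification as two sequential node choices.

    Scoring nodes twice (O(|V|) + O(|V|)) replaces scoring every node
    pair at once (O(|V|^2)), which is the decomposition that makes
    Q-learning tractable on larger graphs.
    """
    # Step 1: choose the first endpoint greedily with respect to Q.
    first = max(nodes, key=lambda v: q_node(edges, v))
    # Step 2: choose the second endpoint conditioned on the first.
    rest = [v for v in nodes if v != first]
    second = max(rest, key=lambda v: q_node(edges, v, chosen_first=first))
    return (min(first, second), max(first, second))


def apply_edge_flip(edges, edge):
    """Add the undirected edge if absent, delete it if present."""
    new_edges = set(edges)
    if edge in new_edges:
        new_edges.remove(edge)
    else:
        new_edges.add(edge)
    return new_edges


if __name__ == "__main__":
    nodes = list(range(6))
    edges = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)}
    edge = pick_edge_hierarchically(nodes, edges)
    print("proposed modification:", edge)
    print("modified edge set:", apply_edge_flip(edges, edge))
```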
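A white-box, gradient-guided attack in the spirit of GradArgmax can be sketched with a dense adjacency matrix and automatic differentiation. The tiny two-layer GCN, the mean pooling, and the flip-scoring rule below are illustrative assumptions rather than the paper's exact architecture; the point being shown is that the gradient of the loss with respect to adjacency entries identifies which edge flips most help the attacker.

```python
import torch
import torch.nn.functional as F


def gcn_forward(A, X, W1, W2):
    """Toy two-layer GCN over a dense adjacency matrix (illustrative only)."""
    A_hat = A + torch.eye(A.size(0))          # add self-loops
    deg = A_hat.sum(dim=1)
    D_inv_sqrt = torch.diag(deg.pow(-0.5))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    H = torch.relu(A_norm @ X @ W1)
    return (A_norm @ H @ W2).mean(dim=0)      # mean-pool to graph-level logits


def grad_argmax_attack(A, X, W1, W2, label, budget=1):
    """Greedily flip the adjacency entries whose loss gradient helps the attack most."""
    A = A.clone()
    for _ in range(budget):
        A_var = A.clone().requires_grad_(True)
        logits = gcn_forward(A_var, X, W1, W2).unsqueeze(0)
        loss = F.cross_entropy(logits, torch.tensor([label]))
        loss.backward()
        grad = A_var.grad.clone()
        grad.fill_diagonal_(float("-inf"))    # never select self-loop entries
        # Adding an absent edge with a positive gradient raises the loss, as does
        # deleting a present edge with a negative gradient; score both kinds of flip.
        score = torch.where(A > 0, -grad, grad)
        idx = torch.argmax(score)
        i, j = idx // A.size(0), idx % A.size(0)
        A[i, j] = A[j, i] = 1.0 - A[i, j]     # flip the chosen undirected edge
    return A


if __name__ == "__main__":
    torch.manual_seed(0)
    n, d, h, c = 8, 4, 16, 2
    A = (torch.rand(n, n) > 0.7).float()
    A = torch.triu(A, 1)
    A = A + A.t()                             # random symmetric adjacency matrix
    X, W1, W2 = torch.randn(n, d), torch.randn(d, h), torch.randn(h, c)
    print(grad_argmax_attack(A, X, W1, W2, label=0, budget=2))
```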
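The genetic-algorithm attack can be sketched as a black-box search over sets of edge flips whose fitness is scored by querying the target model. The population size, mutation rate, and the `score_fn` oracle below are illustrative assumptions, not the paper's exact procedure; any callable that returns a higher value for more damaging modifications (e.g., the loss on the true label) can be plugged in.

```python
import random


def random_flip(nodes):
    """Propose a single undirected edge flip as a sorted node pair."""
    u, v = random.sample(nodes, 2)
    return (min(u, v), max(u, v))


def mutate(flips, nodes, rate=0.3):
    """With probability `rate`, swap one flip for a freshly sampled one."""
    flips = list(flips)
    if flips and random.random() < rate:
        flips[random.randrange(len(flips))] = random_flip(nodes)
    return tuple(flips)


def crossover(parent_a, parent_b):
    """Mix edge flips from two parents while keeping the same budget."""
    pool = list(parent_a) + list(parent_b)
    return tuple(random.sample(pool, len(parent_a)))


def genetic_attack(nodes, score_fn, budget=2, pop_size=20, generations=30):
    """Evolve a population of edge-flip sets; larger score_fn values mean a stronger attack.

    score_fn(flips) is assumed to be a black-box oracle that applies the flips,
    queries the target classifier, and returns something like the loss on the
    true label (confidence scores suffice; no gradients are needed).
    """
    population = [tuple(random_flip(nodes) for _ in range(budget))
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score_fn, reverse=True)
        parents = ranked[: pop_size // 2]      # selection: keep the fittest half
        children = [mutate(crossover(random.choice(parents), random.choice(parents)), nodes)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=score_fn)


if __name__ == "__main__":
    nodes = list(range(10))
    # Toy oracle: pretend the classifier is most confused when node 0 gains
    # edges to high-index nodes. Replace with real model queries in practice.
    toy_score = lambda flips: sum(v for (u, v) in flips if u == 0)
    print("best flips found:", genetic_attack(nodes, toy_score))
```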
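The random-sampling baseline needs only an edge-flip budget and a way to check whether the perturbed graph is misclassified. In the sketch below, `predicts_wrong` is a hypothetical oracle standing in for a query to the target classifier.

```python
import random


def random_edge_flips(nodes, edges, budget):
    """Randomly add or delete up to `budget` undirected edges."""
    edges = set(edges)
    for _ in range(budget):
        u, v = random.sample(nodes, 2)
        key = (min(u, v), max(u, v))
        if key in edges:
            edges.remove(key)
        else:
            edges.add(key)
    return edges


def rand_sampling_attack(nodes, edges, predicts_wrong, budget=1, trials=100):
    """Baseline: draw random modifications until one fools the target model.

    `predicts_wrong(edges)` is a hypothetical oracle that returns True when
    the classifier mislabels the modified graph; no other feedback is used.
    """
    for _ in range(trials):
        candidate = random_edge_flips(nodes, edges, budget)
        if predicts_wrong(candidate):
            return candidate
    return None  # no successful perturbation found within the trial budget


if __name__ == "__main__":
    nodes = list(range(6))
    edges = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)}
    # Toy oracle: pretend the model fails whenever edge (0, 5) is present.
    oracle = lambda e: (0, 5) in e
    print(rand_sampling_attack(nodes, edges, oracle, budget=1, trials=500))
```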
Experimental Evaluation
The authors conduct extensive experiments using both synthetic and real-world datasets to validate the effectiveness of the proposed attacks. The primary findings are:
- GNN models demonstrate significant vulnerability to adversarial attacks across various graph classification and node classification tasks.
- For graph classification, the target classifier's accuracy significantly drops under practical black-box and white-box attacks.
- Node classification models, tested on bibliographic and transactional graphs, also show substantial drops in performance when subjected to edge modifications, even with minimal changes to the graph structure.
- The proposed RL-S2V attack is effective not only against the classifier it was trained to attack but also in the restricted black-box setting, indicating that the learned attack policy transfers.
Implications
The findings suggest that the robustness of GNNs in financial, security, and other critical applications cannot be taken for granted. The adversarial susceptibility reveals implicit flaws in the model's ability to generalize, raising concerns about deploying these models in adversarial environments. This calls for stronger, more robust learning frameworks and effective defense mechanisms.
Future Directions
The paper opens up several avenues for future research:
- Enhanced Defense Mechanisms:
- Development of adversarial training techniques that go beyond the simple random edge-drop defense discussed in the paper, together with methods to detect and mitigate adversarial manipulations.
- Transferable Adversarial Policies:
- Further exploration of transferability in adversarial attacks, potentially developing attack models that generalize across different datasets and GNN architectures.
- Interpretability and Robustness:
- Improving the interpretability of GNNs to understand and mitigate the pathways through which these models can be compromised.
In conclusion, this paper significantly advances the understanding of adversarial vulnerabilities in GNNs, providing critical insights into the design of more resilient models and paving the way for future research in this domain.