- The paper presents a generative method using autoencoders that increases poisoning data generation efficiency by up to 239.38×.
- It rigorously evaluates traditional gradient-based methods, revealing computational bottlenecks and trade-offs in attack potency.
- The study introduces a loss-based detection strategy that offers an efficient countermeasure against adversarial poisoning in neural networks.
Generative Poisoning Attack Methods Against Neural Networks
The paper "Generative Poisoning Attack Method Against Neural Networks" introduces innovative paths in the research of adversarial attacks within neural network models, focusing particularly on the subset known as poisoning attacks. A poisoning attack represents a tangible threat to the integrity and robustness of neural networks by manipulating the training data. The authors, Chaofei Yang, Qing Wu, Hai Li, and Yiran Chen, contribute significant insights into this often underestimated yet pivotal area of machine learning security.
Core Contributions and Methodologies
The research explores and expands the methodology for executing poisoning attacks on Deep Neural Networks (DNNs), an area studied far less extensively than poisoning of Support Vector Machines (SVMs). The paper's central contribution is a generative model that accelerates the synthesis of poisoned data: an autoencoder serves as the generator and is coupled with the target neural network, which plays the role of a discriminator, through a loss-based reward function that steers the generator toward samples that harm the target model.
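To make this coupling concrete, the following is a minimal PyTorch-style sketch of one alternating round between an autoencoder generator and a target classifier. The architectures, optimizers, label assignment, and the exact reward formulation are illustrative assumptions rather than the paper's reference implementation; in particular, the generator step below simply drives up the target's loss on the generated samples while the target's weights are held fixed.

```python
# Hypothetical sketch of the generator/target coupling; names and shapes
# (MNIST-sized inputs) are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoisonGenerator(nn.Module):
    """Small convolutional autoencoder acting as the poison-data generator."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def poisoning_round(generator, target_net, clean_x, clean_y, seed_x, poison_y,
                    gen_opt, tgt_opt):
    """One alternating round: the target trains on clean + poisoned data,
    then the generator is updated to *increase* the target's loss."""
    # 1) Generate poison samples from seed images (no generator gradient yet).
    with torch.no_grad():
        poison_x = generator(seed_x)

    # 2) Victim step: the target network trains on the mixed batch.
    tgt_opt.zero_grad()
    mixed_x = torch.cat([clean_x, poison_x])
    mixed_y = torch.cat([clean_y, poison_y])
    F.cross_entropy(target_net(mixed_x), mixed_y).backward()
    tgt_opt.step()

    # 3) Generator step: hold the target's weights fixed and treat its loss on
    #    the generated samples as the reward to maximize (gradient ascent,
    #    realized here by minimizing the negative loss).
    gen_opt.zero_grad()
    poison_x = generator(seed_x)
    reward = F.cross_entropy(target_net(poison_x), poison_y)
    (-reward).backward()
    gen_opt.step()
    return reward.item()

# Usage sketch (hypothetical components, MNIST-sized tensors):
# generator = PoisonGenerator()
# target_net = nn.Sequential(nn.Flatten(), nn.Linear(784, 128),
#                            nn.ReLU(), nn.Linear(128, 10))
# gen_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
# tgt_opt = torch.optim.SGD(target_net.parameters(), lr=0.1)
```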
The primary contributions of the authors are threefold:
- Direct Gradient Method Analysis: The authors undertake a robust evaluation of traditional gradient-based methods for generating poisoned data, identifying the computation of second partial derivatives as the key bottleneck in the process (unpacked in the first sketch following this list).
- Introduction of a Generative Method: A novel generative approach is proposed, inspired by the framework of Generative Adversarial Networks (GANs). By employing an autoencoder as the generator, this method improves the rate of poisoned-data generation by up to 239.38× over the direct gradient method, at the cost of only a modest reduction in the accuracy degradation the attack inflicts on the target model.
- Countermeasure Design: In addition to the attack methodologies, the paper devises a loss-based detection scheme. This countermeasure exploits the anomalous losses that poisoned inputs induce, offering a computationally cheap signal of a potential attack (see the second sketch following this list).
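The first sketch below unpacks the bottleneck identified in the direct gradient analysis: the gradient of the post-update validation loss with respect to a poison input must be taken through the weight update itself, which introduces second partial derivatives of the training loss. The explicit two-layer model and single unrolled SGD step are simplifying assumptions for illustration, not the paper's full procedure.

```python
# Hypothetical sketch of why the direct gradient method is costly: second-order
# derivatives arise when differentiating through the victim's weight update.
import torch
import torch.nn.functional as F

def poison_gradient_one_step(w1, b1, w2, b2, x_poison, y_poison,
                             x_val, y_val, lr=0.1):
    """d(validation loss after one SGD step) / d(x_poison), via unrolling."""
    x_poison = x_poison.detach().clone().requires_grad_(True)

    def forward(w1, b1, w2, b2, x):
        # Minimal two-layer classifier written functionally so that updated
        # weights can remain inside the autograd graph.
        h = torch.relu(x.flatten(1) @ w1 + b1)
        return h @ w2 + b2

    # Training loss on the poison sample; create_graph=True keeps the graph so
    # the weight update stays differentiable with respect to x_poison.
    train_loss = F.cross_entropy(forward(w1, b1, w2, b2, x_poison), y_poison)
    grads = torch.autograd.grad(train_loss, (w1, b1, w2, b2), create_graph=True)

    # One unrolled SGD step: w' = w - lr * dL_train/dw (a function of x_poison).
    w1u, b1u, w2u, b2u = (p - lr * g for p, g in zip((w1, b1, w2, b2), grads))

    # Validation loss under the updated weights; backpropagating it to x_poison
    # requires second derivatives of the training loss -- the expensive part.
    val_loss = F.cross_entropy(forward(w1u, b1u, w2u, b2u, x_val), y_val)
    return torch.autograd.grad(val_loss, x_poison)[0]

# Usage sketch (hypothetical shapes for MNIST-sized inputs):
# w1 = 0.01 * torch.randn(784, 64); w1.requires_grad_(True)
# b1 = torch.zeros(64, requires_grad=True)
# w2 = 0.01 * torch.randn(64, 10); w2.requires_grad_(True)
# b2 = torch.zeros(10, requires_grad=True)
# grad_x = poison_gradient_one_step(w1, b1, w2, b2, x_p, y_p, x_val, y_val)
```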
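The second sketch illustrates the flavor of the loss-based countermeasure: track per-sample training losses and flag samples whose loss is anomalously high relative to recent history. The z-score rule, window size, and threshold below are assumptions chosen for illustration; the paper's exact detection statistic may differ.

```python
# Hypothetical loss-based detector: cheap to run, since it reuses the
# per-sample losses the training loop already computes.
import collections
import torch
import torch.nn.functional as F

class LossBasedDetector:
    def __init__(self, window: int = 1000, z_threshold: float = 3.0):
        self.history = collections.deque(maxlen=window)  # recent per-sample losses
        self.z_threshold = z_threshold

    def check_batch(self, model, x, y):
        """Return a boolean mask marking samples with suspiciously high loss."""
        with torch.no_grad():
            losses = F.cross_entropy(model(x), y, reduction="none")
        if len(self.history) < 50:  # not enough statistics yet: collect, flag nothing
            self.history.extend(losses.tolist())
            return torch.zeros_like(losses, dtype=torch.bool)
        ref = torch.tensor(list(self.history))
        z_scores = (losses - ref.mean()) / (ref.std() + 1e-8)
        suspicious = z_scores > self.z_threshold
        # Only fold apparently clean samples back into the reference window.
        self.history.extend(losses[~suspicious].tolist())
        return suspicious
```

Because the statistic piggybacks on losses that the training loop computes anyway, the overhead stays negligible, which fits the paper's emphasis on an efficient countermeasure.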
Empirical Evaluation
The methods were evaluated on the MNIST and CIFAR-10 datasets. The experiments show that the generative method not only accelerates poisoned-data generation but also scales more readily to larger models. Although its attack potency is marginally lower than that of the direct gradient method, the drastic gain in efficiency gives it a clear practical advantage, especially in time-sensitive, real-world settings.
Implications and Future Directions
This paper carries important implications for both attackers and defenders of neural network systems. Practically, the ability to generate poisoned data rapidly is a tangible threat that practitioners must account for. Theoretically, it compels a deeper inspection of how resilient neural networks are to adversarial manipulation of their training data, urging future research into more robust defense strategies.
Speculating on future research trajectories, architectural advancements in generators or alternative gradient policy designs could enhance both the speed and impact of such attacks. Furthermore, there exists ample room to explore the dynamics of autoencoder configurations within this context to maximize their generative utility.
Overall, the paper adds considerable depth to the broader discourse on neural network security by demonstrating how generative mechanisms can be turned to malicious ends. Pairing the attack with an efficient, loss-based detection mechanism marks a necessary step in safeguarding machine learning pipelines. The interplay of attack and defense elucidated in this research underscores the constant vigilance required in advancing AI technologies.