
Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization (1803.08680v4)

Published 23 Mar 2018 in cs.LG, cs.CR, cs.CV, and stat.ML

Abstract: Deep neural networks have lately shown tremendous performance in various applications including vision and speech processing tasks. However, alongside their ability to perform these tasks with such high accuracy, it has been shown that they are highly susceptible to adversarial attacks: a small change in the input would cause the network to err with high confidence. This phenomenon exposes an inherent fault in these networks and their ability to generalize well. For this reason, providing robustness to adversarial attacks is an important challenge in networks training, which has led to extensive research. In this work, we suggest a theoretically inspired novel approach to improve the networks' robustness. Our method applies regularization using the Frobenius norm of the Jacobian of the network, which is applied as post-processing, after regular training has finished. We demonstrate empirically that it leads to enhanced robustness results with a minimal change in the original network's accuracy.

Citations (203)

Summary

  • The paper introduces the use of the Frobenius norm of the Jacobian as a regularization term to significantly improve DNN robustness.
  • It validates the approach through experiments on MNIST, CIFAR-10, and CIFAR-100, showing notable gains in adversarial resistance under various attack methods.
  • The method offers a computationally efficient, post-processing technique that integrates easily with existing models and complements adversarial training.

Jacobian Regularization for Enhancing DNN Robustness to Adversarial Attacks

The vulnerability of deep neural networks (DNNs) to adversarial attacks has captured significant attention due to its implications for security and reliability in critical applications. A plethora of defensive strategies have emerged in response, emphasizing either the detection of adversarial inputs or the direct improvement of network robustness. This paper takes the latter route, employing a novel application of Jacobian regularization to fortify DNNs against adversarial perturbations.

The core proposition of the paper is the usage of the Frobenius norm of the network's Jacobian matrix as a regularization term. This regularization is applied as a post-processing step after the completion of traditional training. The rationale behind this approach lies in theoretical insights relating adversarial vulnerability to the geometry of decision boundaries and the generalization error of the network, which can be connected to the norm of the Jacobian.
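The regularized quantity is simple to state: for a network f with input-output Jacobian J(x) = ∂f/∂x, the penalty is the squared Frobenius norm ||J(x)||_F², averaged over training samples. As a minimal illustration (not the paper's implementation, which back-propagates this term efficiently), the norm can be estimated by finite differences; the function names below are illustrative:

```python
import math

def frobenius_norm_jacobian(f, x, eps=1e-5):
    """Estimate ||J||_F of f at x via central finite differences.
    f maps a list of floats to a list of floats."""
    n = len(x)
    m = len(f(x))
    total = 0.0
    for j in range(n):
        xp = list(x); xp[j] += eps
        xm = list(x); xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(m):
            d = (fp[i] - fm[i]) / (2 * eps)  # entry J[i][j]
            total += d * d
    return math.sqrt(total)

# Sanity check: for a linear map f(x) = W x the Jacobian is exactly W,
# so the estimate should match the Frobenius norm of W.
W = [[1.0, 2.0], [3.0, 4.0]]
f = lambda x: [sum(w * xi for w, xi in zip(row, x)) for row in W]
print(frobenius_norm_jacobian(f, [0.5, -1.0]))  # ≈ sqrt(1+4+9+16) ≈ 5.477
```

In a real training loop this penalty would be added to the task loss with a weight λ and differentiated with respect to the network parameters, not estimated numerically.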

Theoretical and Methodological Insights

The foundation for utilizing Jacobian regularization stems from the notion that large gradients of a network's classification function with respect to input data lead to susceptibility to adversarial perturbations. Regularizing these gradients, quantified via the Frobenius norm of the Jacobian, aims to minimize such vulnerabilities. This approach is rooted in recent theoretical work highlighting the connection between the Jacobian norm and a network’s generalization capabilities.
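The intuition can be checked numerically: to first order, a perturbation δ changes the output by J(x)δ, and ||J(x)δ||₂ ≤ ||J(x)||_F ||δ||₂, so shrinking the Frobenius norm caps the worst-case effect of small input changes. A toy verification of the bound with a fixed linear map (values are arbitrary):

```python
import math, random

random.seed(0)
W = [[1.0, -2.0], [0.5, 3.0]]
fro = math.sqrt(sum(w * w for row in W for w in row))  # ||W||_F

def apply(W, v):
    return [sum(w * vi for w, vi in zip(row, v)) for row in W]

# For a (locally) linear model, the output change under perturbation delta
# is bounded by ||J||_F * ||delta||: a smaller Jacobian norm means small
# input changes cannot move the output far.
for _ in range(1000):
    delta = [random.uniform(-1, 1) for _ in range(2)]
    out = apply(W, delta)
    lhs = math.sqrt(sum(o * o for o in out))
    rhs = fro * math.sqrt(sum(d * d for d in delta))
    assert lhs <= rhs + 1e-12
print("bound holds")
```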

Further, the paper elucidates the relationship between adversarial perturbations and decision-boundary curvature. Decision boundaries with positive curvature have been linked to increased vulnerability to small, universal perturbations. By reducing the Frobenius norm of the Jacobian, the paper argues, the curvature of the decision boundary is correspondingly reduced, enhancing robustness.

Empirical Validation

Experiments conducted on the MNIST, CIFAR-10, and CIFAR-100 datasets validate the effectiveness of the proposed method. The robustness metric ρ̂_adv, the average ratio between the norm of the minimal perturbation required to fool the network and the norm of the input, improves substantially under the DeepFool, FGSM, and JSMA attacks for models trained with Jacobian regularization. Notably, Jacobian regularization alone surpasses strategies such as adversarial training and Cross-Lipschitz regularization in many scenarios. When combined with adversarial training, further performance gains are realized, underscoring the complementary potential of these techniques.
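Concretely, ρ̂_adv averages ||δ_min(x)||₂ / ||x||₂ over the test set, where δ_min(x) is the smallest perturbation an attack finds that flips the prediction; larger values indicate a more robust network. A small illustrative sketch, assuming the minimal perturbations have already been produced by an attack:

```python
import math

def l2(v):
    return math.sqrt(sum(x * x for x in v))

def rho_adv(samples, min_perturbations):
    """Average ratio of minimal fooling-perturbation norm to input norm.
    Larger values mean the attack needs bigger changes to succeed."""
    ratios = [l2(d) / l2(x) for x, d in zip(samples, min_perturbations)]
    return sum(ratios) / len(ratios)

xs = [[3.0, 4.0], [1.0, 0.0]]   # inputs with norms 5 and 1
ds = [[0.3, 0.4], [0.2, 0.0]]   # minimal perturbations found by an attack
print(rho_adv(xs, ds))          # (0.1 + 0.2) / 2 ≈ 0.15
```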

Practical Implications and Computational Efficiency

A key advantage of the proposed method lies in its computational efficiency. The post-processing training phase incurs the overhead of merely one additional back-propagation step per iteration. This contrasts with other techniques such as Defensive Distillation, which are considerably more resource-intensive. The ease of integration with pre-trained models also highlights its practical applicability in enhancing existing systems without substantial computational cost.
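The cost structure can be made concrete with a toy post-processing loop on a one-parameter model f(x) = tanh(wx), whose input-Jacobian is w·(1 − tanh(wx)²). This is entirely illustrative (numerical gradients stand in for back-propagation), but it mirrors the scheme: start from an already-trained weight, then minimize task loss plus λ times the squared Jacobian norm, paying one extra gradient evaluation per step:

```python
import math

def model(w, x):
    return math.tanh(w * x)

def input_jacobian(w, x):
    # d model / d x = w * (1 - tanh(w x)^2)
    return w * (1 - math.tanh(w * x) ** 2)

def numgrad(f, w, eps=1e-6):
    return (f(w + eps) - f(w - eps)) / (2 * eps)

def fine_tune(w, x, y, lam=0.1, lr=0.05, steps=200):
    """Post-processing phase: continue from a fitted weight, adding
    lam * (input-Jacobian)^2 to the task loss. Each step costs one
    extra gradient evaluation for the penalty term."""
    task = lambda w: (model(w, x) - y) ** 2
    penalty = lambda w: input_jacobian(w, x) ** 2
    for _ in range(steps):
        w -= lr * (numgrad(task, w) + lam * numgrad(penalty, w))
    return w

w0 = 2.0                               # weight from "regular" training
w1 = fine_tune(w0, x=1.0, y=model(2.0, 1.0))
# After fine-tuning, the input-Jacobian shrinks while the fit stays close.
```

The same pattern applies to a full network: the penalty's gradient is obtained with one additional back-propagation pass, which is where the paper's efficiency claim comes from.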

Implications and Future Directions

The proposed Jacobian regularization offers a theoretically grounded, efficient means of bolstering DNNs against adversarial attacks. Its empirical success in diverse benchmarks suggests promising practical applications, especially in areas where model integrity is paramount.

Future research could explore augmentations to this framework, such as adaptive regularization weights for different Jacobian rows or alternate norm choices tailored to specific adversarial metrics. Investigating the interplay between Jacobian regularization and data augmentation strategies could yield further insights into optimizing robustness.

In conclusion, the theoretical underpinning and empirical verification of Jacobian regularization underscore its potential as an effective component of defenses against adversarial attacks, and mark a promising direction for future work on DNN robustness.