An Evaluation of Adversarially Robust Distillation (ARD) Techniques
This paper provides a comprehensive evaluation of Adversarially Robust Distillation (ARD) techniques, focusing on the efficiency of student-teacher models and on how various hyperparameters shape the trade-off between natural and robust accuracy. The authors scrutinize the ARD process using different neural network architectures as teacher-student pairs, such as WideResNet and ResNet18 teacher models paired with a MobileNetV2 student.
In detail, the research critically assesses the space and time efficiency of these deep learning models. The ResNet18 and WideResNet contain approximately $11.2$ million and $46.2$ million parameters, respectively, while the MobileNetV2 student model is significantly more compact at $2.3$ million parameters. Compute efficiency was measured in multiply-add (MAdd) operations, where a notable finding was that a forward pass through the MobileNetV2 student requires only about 1.4% of the MAdd operations of the WideResNet teacher.
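The compression figures above can be sanity-checked with simple arithmetic. The sketch below uses only the parameter counts reported in the text; the dictionary layout is illustrative, not taken from the paper.

```python
# Parameter counts in millions, as reported in the text.
params = {"WideResNet": 46.2, "ResNet18": 11.2, "MobileNetV2": 2.3}

# Size of the student relative to each teacher.
for teacher in ("WideResNet", "ResNet18"):
    ratio = params["MobileNetV2"] / params[teacher]
    print(f"MobileNetV2 is {ratio:.1%} the size of {teacher}")
```

Note that the roughly 5% parameter ratio against WideResNet is distinct from the 1.4% MAdd ratio: parameter count measures storage, while MAdd counts measure per-inference compute, and the two need not scale together.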
The paper investigates in depth the effects of the temperature and $\alpha$ parameters on knowledge distillation outcomes with respect to robustness trade-offs. It finds that while the temperature parameter has minimal effect on robustness, reducing $\alpha$ causes a rapid decline in robustness, indicating an accuracy-robustness trade-off that is especially pronounced at low $\alpha$ values. Different data augmentation approaches were also considered, with findings suggesting that basic strategies such as horizontal flips and random cropping were effective for robustness, while avoiding the drawbacks associated with distilling the teacher's behavior at adversarially perturbed points.
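To make concrete how temperature and $\alpha$ enter a distillation-style objective, the following is a minimal numerical sketch, not the authors' implementation: $\alpha$ weights a temperature-scaled KL term (student on the adversarial input vs. teacher on the clean input) against $(1-\alpha)$ times cross-entropy on the clean input. All logit values and helper names here are illustrative assumptions.

```python
import math

def softmax_t(logits, tau):
    """Temperature-scaled softmax: higher tau flattens the distribution."""
    exps = [math.exp(z / tau) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_div(p, q):
    """KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def ard_style_loss(student_adv, student_clean, teacher_clean, label, tau, alpha):
    """Illustrative ARD-style objective: a tau^2-scaled KL distillation term
    mixed with a clean cross-entropy term via alpha."""
    kl = kl_div(softmax_t(teacher_clean, tau), softmax_t(student_adv, tau))
    ce = -math.log(softmax_t(student_clean, 1.0)[label])
    return alpha * tau**2 * kl + (1 - alpha) * ce

# Toy logits for a 3-class problem (illustrative values only).
teacher = [4.0, 1.0, 0.5]
student_adv = [2.0, 1.5, 0.5]
student_clean = [3.0, 1.0, 0.5]
loss_high_alpha = ard_style_loss(student_adv, student_clean, teacher, 0, tau=5.0, alpha=0.9)
loss_low_alpha = ard_style_loss(student_adv, student_clean, teacher, 0, tau=5.0, alpha=0.1)
```

Lowering $\alpha$ shifts weight from the robustness-inducing distillation term toward the clean cross-entropy term, which is consistent with the reported rapid loss of robustness at small $\alpha$.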
Multiple experimental tables demonstrate the nuanced impact of hyperparameter adjustments, such as temperature and $\alpha$, on the performance of knowledge distillation. Additional analysis evaluates other ARD configurations. For instance, including a knowledge distillation KL divergence term degraded robust accuracy, and substituting $T_t(x_i')$ for $T_t(x_i)$ in the loss function was found detrimental to overall accuracy.
Moreover, naturally trained teacher models under ARD conditions still produced students with some robustness, albeit lower than students distilled from adversarially trained counterparts. Further, the authors explored accelerating ARD training by reducing the number of attack steps, which improved natural accuracy but slightly reduced robust accuracy, supporting strategic trade-offs in ARD deployments.
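The step-count trade-off can be illustrated with a minimal PGD sketch on a toy one-dimensional loss: fewer ascent steps cost less compute but find a weaker perturbation within the same $\ell_\infty$ budget. The toy loss, step size, and budget below are assumptions for illustration only, not the paper's attack configuration.

```python
def loss(x):
    # Toy loss the attacker tries to maximize (minimized at x = 0.3).
    return (x - 0.3) ** 2

def grad(x):
    return 2 * (x - 0.3)

def pgd(x0, eps, step, n_steps):
    """L-infinity PGD: take signed-gradient ascent steps on the loss,
    projecting back into the interval [x0 - eps, x0 + eps] each time."""
    x = x0
    for _ in range(n_steps):
        x = x + step * (1 if grad(x) > 0 else -1)  # signed-gradient step
        x = max(x0 - eps, min(x0 + eps, x))        # project to eps-ball
    return x

x0 = 0.0
weak = pgd(x0, eps=0.1, step=0.02, n_steps=2)    # cheaper, weaker attack
strong = pgd(x0, eps=0.1, step=0.02, n_steps=10)  # costlier, stronger attack
```

Here the 10-step attack saturates the $\epsilon$-ball while the 2-step attack does not, mirroring the observation that cheaper inner attacks yield adversarial examples that stress the student less, nudging training toward natural accuracy at some cost in robustness.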
Practically and theoretically, these findings contribute to the ongoing discourse in adversarial training by revealing the intricacies of ARD's hyperparameter sensitivity and operational efficiency. The implications of this work suggest potential refinements in robust model training protocols and highlight the balance between computational cost and accuracy in practice. Future directions might involve adaptive data augmentation tailored for robustness, enhancing the applicability of ARD strategies in real-world scenarios.