When and How to Fool Explainable Models (and Humans) with Adversarial Examples (2107.01943v2)

Published 5 Jul 2021 in cs.LG and cs.CR

Abstract: Reliable deployment of machine learning models such as neural networks continues to be challenging due to several limitations. Some of the main shortcomings are the lack of interpretability and the lack of robustness against adversarial examples or out-of-distribution inputs. In this exploratory review, we explore the possibilities and limits of adversarial attacks for explainable machine learning models. First, we extend the notion of adversarial examples to fit in explainable machine learning scenarios, in which the inputs, the output classifications and the explanations of the model's decisions are assessed by humans. Next, we propose a comprehensive framework to study whether (and how) adversarial examples can be generated for explainable models under human assessment, introducing and illustrating novel attack paradigms. In particular, our framework considers a wide range of relevant yet often ignored factors such as the type of problem, the user expertise or the objective of the explanations, in order to identify the attack strategies that should be adopted in each scenario to successfully deceive the model (and the human). The intention of these contributions is to serve as a basis for a more rigorous and realistic study of adversarial examples in the field of explainable machine learning.
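
To make the abstract's setting concrete, here is a minimal, self-contained sketch (our own illustration, not code from the paper): it crafts a standard FGSM adversarial example and then compares a simple input-gradient saliency "explanation" before and after the attack, in the spirit of asking whether a human inspecting the explanation would notice the manipulation. `TinyNet`, the perturbation budget `eps`, and the cosine-similarity drift check are illustrative placeholders.

```python
# Illustrative sketch only: FGSM attack plus a crude "explanation drift" check.
# The model, epsilon, and saliency choice are placeholders, not the paper's method.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    """Placeholder classifier standing in for any differentiable model."""
    def __init__(self, in_dim=32, n_classes=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, n_classes))

    def forward(self, x):
        return self.net(x)

def fgsm(model, x, y, eps=0.1):
    """Fast Gradient Sign Method: x_adv = x + eps * sign(grad_x loss)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

def saliency(model, x, y):
    """Input-gradient saliency, a simple stand-in for a post hoc explanation."""
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    logits[torch.arange(len(y)), y].sum().backward()
    return x.grad.abs()

model = TinyNet()
x = torch.randn(4, 32)
y = torch.randint(0, 3, (4,))

x_adv = fgsm(model, x, y)
flipped = model(x).argmax(1) != model(x_adv).argmax(1)
# Low cosine similarity between clean and adversarial saliency maps suggests the
# explanation changed noticeably, which a human assessor might catch.
drift = F.cosine_similarity(saliency(model, x, y).flatten(1),
                            saliency(model, x_adv, y).flatten(1), dim=1)
print("labels flipped:", flipped.tolist())
print("explanation similarity:", drift.tolist())
```

In the paper's terms, an attack of this kind only succeeds end to end if the prediction is flipped while the accompanying explanation remains plausible enough to pass human assessment.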
