IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency (2405.09786v3)

Published 16 May 2024 in cs.LG and cs.CR

Abstract: Deep neural networks (DNNs) are vulnerable to backdoor attacks, where adversaries can maliciously trigger model misclassifications by implanting a hidden backdoor during training. This paper proposes a simple yet effective input-level backdoor detection method (dubbed IBD-PSC) that acts as a 'firewall' to filter out malicious test images. Our method is motivated by an intriguing phenomenon, parameter-oriented scaling consistency (PSC): when model parameters are amplified, the prediction confidences of poisoned samples remain significantly more consistent than those of benign ones. We provide a theoretical analysis that grounds the PSC phenomenon, and we design an adaptive method to select which batch normalization (BN) layers to scale up for effective detection. Extensive experiments on benchmark datasets verify the effectiveness and efficiency of IBD-PSC and its resistance to adaptive attacks. Code is available at https://github.com/THUYimingLi/BackdoorBox.
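
The detection rule described in the abstract can be sketched as follows: amplify the affine parameters of selected BN layers by several factors, and flag inputs whose confidence in the originally predicted label stays consistently high under that amplification. Below is a minimal PyTorch sketch of this idea, assuming a standard image classifier with BatchNorm2d layers; the function names, scaling factors, and the choice to scale the last few BN layers are illustrative assumptions, not the authors' implementation (see the linked BackdoorBox repository for that).

```python
# Minimal sketch of parameter-oriented scaling consistency (PSC) detection.
# Assumption: `model` is a PyTorch classifier containing BatchNorm2d layers.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def scaled_copy(model: nn.Module, factor: float, num_layers: int) -> nn.Module:
    """Return a copy of `model` whose last `num_layers` BatchNorm layers
    have their affine parameters amplified by `factor` (an illustrative
    stand-in for the paper's adaptive BN-layer selection)."""
    scaled = copy.deepcopy(model)
    bn_layers = [m for m in scaled.modules() if isinstance(m, nn.BatchNorm2d)]
    for bn in bn_layers[-num_layers:]:
        bn.weight.data *= factor   # scaling weight and bias amplifies the
        bn.bias.data *= factor     # whole BN output by `factor`
    return scaled

@torch.no_grad()
def psc_score(model: nn.Module, x: torch.Tensor,
              factors=(1.5, 2.0, 2.5), num_layers: int = 2) -> torch.Tensor:
    """Average confidence the scaled models assign to each input's original
    prediction. Per the PSC phenomenon, poisoned inputs tend to keep a high,
    consistent score under amplification; benign inputs usually do not."""
    model.eval()
    base_pred = model(x).argmax(dim=1)          # original predicted labels
    scores = []
    for k in factors:
        probs = F.softmax(scaled_copy(model, k, num_layers)(x), dim=1)
        scores.append(probs.gather(1, base_pred.unsqueeze(1)).squeeze(1))
    return torch.stack(scores).mean(dim=0)      # per-input score in [0, 1]

# Firewall-style usage: flag inputs whose score exceeds a threshold `tau`
# tuned on a small set of benign samples, e.g.:
#   suspicious = psc_score(model, batch) > tau
```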
