Machine Learning needs Better Randomness Standards: Randomised Smoothing and PRNG-based attacks (2306.14043v2)

Published 24 Jun 2023 in cs.LG, cs.AI, and cs.CR

Abstract: Randomness supports many critical functions in the field of ML including optimisation, data selection, privacy, and security. ML systems outsource the task of generating or harvesting randomness to the compiler, the cloud service provider or elsewhere in the toolchain. Yet there is a long history of attackers exploiting poor randomness, or even creating it -- as when the NSA put backdoors in random number generators to break cryptography. In this paper we consider whether attackers can compromise an ML system using only the randomness on which it commonly relies. We focus our effort on Randomised Smoothing, a popular approach for training certifiably robust models and for certifying specific input datapoints of an arbitrary model. We choose Randomised Smoothing since it is used for both security and safety -- to counteract adversarial examples and quantify uncertainty respectively. Under the hood, it relies on sampling Gaussian noise to explore the volume around a data point to certify that a model is not vulnerable to adversarial examples. We demonstrate an entirely novel attack, where an attacker backdoors the supplied randomness to falsely certify either an overestimate or an underestimate of robustness by up to 81 times. We demonstrate that such attacks are possible, that they require very small changes to randomness to succeed, and that they are hard to detect. As an example, we hide an attack in the random number generator and show that the randomness tests suggested by NIST fail to detect it. We advocate updating the NIST guidelines on random number testing to make them more appropriate for safety-critical and security-critical machine-learning applications.
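
To make the sampling step concrete, here is a minimal sketch of a randomised-smoothing certificate in Python. It is not the authors' code. The classifier `toy_classifier`, the helper `certify`, and all parameter values are hypothetical, and it assumes a Hoeffding lower confidence bound where practical implementations typically use a tighter binomial bound. The radius formula `sigma * Phi^{-1}(p_lower)` is the standard L2 certificate for Gaussian smoothing. The noise source is passed in explicitly so that the dependence on the random number generator is visible. The second run uses a deliberately crude tampering (noise with half the claimed standard deviation), chosen only for illustration; it is not the attack described in the paper.

```python
import math
import random
from statistics import NormalDist


def toy_classifier(x):
    """Hypothetical 2-class model: class 1 if the first coordinate is positive."""
    return int(x[0] > 0.0)


def certify(classify, x, sigma, n, alpha, gauss):
    """Return (label, certified L2 radius), or (None, 0.0) to abstain.

    `gauss(mu, sigma)` is the noise source. The certificate is only as
    trustworthy as this function.
    """
    counts = {}
    for _ in range(n):
        noisy = [xi + gauss(0.0, sigma) for xi in x]
        label = classify(noisy)
        counts[label] = counts.get(label, 0) + 1
    top, k = max(counts.items(), key=lambda kv: kv[1])
    # Assumption: one-sided Hoeffding bound on the top-class probability,
    # valid with confidence 1 - alpha if the samples are truly i.i.d.
    p_lower = k / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    if p_lower <= 0.5:
        return None, 0.0
    return top, sigma * NormalDist().inv_cdf(p_lower)


x = [1.0, 0.0]  # true distance to the decision boundary is 1.0
sigma, n, alpha = 0.5, 10_000, 0.001

honest_rng = random.Random(0)
label_honest, r_honest = certify(toy_classifier, x, sigma, n, alpha, honest_rng.gauss)

# Illustrative tampering: the generator silently halves the noise scale.
bad_rng = random.Random(0)


def tampered(mu, s):
    return bad_rng.gauss(mu, 0.5 * s)


label_tampered, r_tampered = certify(toy_classifier, x, sigma, n, alpha, tampered)

print("honest radius:  ", r_honest)
print("tampered radius:", r_tampered)
```

In this toy setting the honest certificate stays below the true boundary distance of 1.0, while the tampered noise source yields a larger radius without any change to the model or to `certify` itself.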

