Python Fuzzing for Trustworthy Machine Learning Frameworks (2403.12723v2)

Published 19 Mar 2024 in cs.CR, cs.AI, and cs.SE

Abstract: Ensuring the security and reliability of machine learning frameworks is crucial for building trustworthy AI-based systems. Fuzzing, a popular technique in the secure software development lifecycle (SSDLC), can be used to develop secure and robust software. Popular machine learning frameworks such as PyTorch and TensorFlow are complex and written in multiple programming languages, including C/C++ and Python. We propose a dynamic analysis pipeline for Python projects using the Sydr-Fuzz toolset. Our pipeline includes fuzzing, corpus minimization, crash triaging, and coverage collection. Crash triaging and severity estimation are important steps to ensure that the most critical vulnerabilities are addressed promptly. Furthermore, the proposed pipeline is integrated into GitLab CI. To identify the most vulnerable parts of the machine learning frameworks, we analyze their potential attack surfaces and develop fuzz targets for PyTorch, TensorFlow, and related projects such as h5py. Applying our dynamic analysis pipeline to these targets, we discovered 3 new bugs and proposed fixes for them.
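The pipeline stages named in the abstract (fuzzing, corpus minimization, crash triaging, coverage collection) can be illustrated with a toy, standard-library-only sketch. This is not Sydr-Fuzz and does not reflect its interfaces; the target `parse_record`, the helpers `mutate`, `crash_signature`, `line_coverage`, `fuzz`, and `minimize`, and the planted bug are all hypothetical names invented for illustration. The sketch assumes that crashes can be deduplicated by exception type and innermost stack frame, and that a corpus can be reduced greedily by line coverage.

```python
import random
import sys
import traceback


def parse_record(data: bytes) -> int:
    """Hypothetical target: a toy length-prefixed parser with a planted bug."""
    if len(data) < 2:
        raise ValueError("too short")  # expected rejection, not a crash
    length = data[0]
    payload = data[1:1 + length]
    # Planted bug: indexes the payload without checking it is long enough.
    return payload[length - 1] if length else 0


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite, insert, or delete one random byte."""
    buf = bytearray(seed)
    choice = rng.randrange(3)
    if choice == 0 and buf:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    elif choice == 1:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def crash_signature(exc: BaseException) -> tuple:
    """Triage key: exception type plus the innermost frame of the traceback."""
    frame = traceback.extract_tb(exc.__traceback__)[-1]
    return (type(exc).__name__, frame.name, frame.lineno)


def line_coverage(target, data: bytes) -> frozenset:
    """Collect (function, line) pairs executed while running target(data)."""
    lines = set()

    def tracer(frame, event, arg):
        if event == "line":
            lines.add((frame.f_code.co_name, frame.f_lineno))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        target(data)
    except Exception:
        pass  # coverage is still recorded up to the failing line
    finally:
        sys.settrace(previous)
    return frozenset(lines)


def fuzz(target, seeds, iterations=2000, seed=0):
    """Mutation loop: keep accepted inputs, keep one input per crash signature."""
    rng = random.Random(seed)
    corpus, crashes = list(seeds), {}
    for _ in range(iterations):
        data = mutate(rng.choice(corpus), rng)
        try:
            target(data)
        except ValueError:
            continue  # documented rejection of malformed input
        except Exception as exc:
            crashes.setdefault(crash_signature(exc), data)
            continue
        corpus.append(data)
    return corpus, crashes


def minimize(target, corpus):
    """Greedy minimization: keep an input only if it adds new line coverage."""
    covered, kept = set(), []
    for data in sorted(set(corpus), key=len):
        cov = line_coverage(target, data)
        if not cov <= covered:
            kept.append(data)
            covered |= cov
    return kept


if __name__ == "__main__":
    corpus, crashes = fuzz(parse_record, [b"\x01a"])
    print("unique crashes:", list(crashes))
    print("minimized corpus size:", len(minimize(parse_record, corpus)))
```

Grouping crashes by signature is what keeps triage manageable here: many mutated inputs hit the same faulty line, but only one representative per signature is retained. A production pipeline would use coverage-guided mutation and sanitizer reports rather than Python exceptions alone.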
