HardTaint: Production-Run Dynamic Taint Analysis via Selective Hardware Tracing (2402.17241v1)

Published 27 Feb 2024 in cs.CR, cs.PL, and cs.SE

Abstract: Dynamic taint analysis (DTA) is a fundamental analysis technique widely used in security, privacy, and diagnosis. Because DTA must collect and analyze massive amounts of taint data online, it incurs extremely high runtime overhead. Over the past decades, numerous attempts have been made to lower this overhead. Unfortunately, the reductions achieved are marginal, leaving DTA applicable only to debugging and testing scenarios. In this paper, we propose and implement HardTaint, a system that realizes production-run dynamic taint tracking. HardTaint adopts a hybrid, systematic design that combines static analysis, selective hardware tracing, and parallel graph processing. Comprehensive evaluations demonstrate that HardTaint introduces only around 9% runtime overhead, an order of magnitude lower than the state of the art, without sacrificing any taint detection capability.
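
For readers new to the technique, the following is a minimal sketch of generic dynamic taint propagation, not HardTaint's design. It assumes a toy model in which each variable has a shadow entry holding a set of taint labels; the names `TaintTracker`, `source`, `assign`, `sink`, and `run_example` are hypothetical and chosen only for illustration. Because every assignment has to update the shadow state, the sketch also hints at why performing this bookkeeping inline is expensive.

```python
from dataclasses import dataclass, field


@dataclass
class TaintTracker:
    """Toy shadow state: maps a variable name to the set of taint labels it carries."""
    shadow: dict = field(default_factory=dict)

    def source(self, dst, label):
        # A taint source (e.g. data read from the network) marks dst with a label.
        self.shadow[dst] = {label}

    def assign(self, dst, *srcs):
        # Models dst = f(srcs): dst inherits the union of its operands' labels.
        # With no operands (a constant assignment), dst becomes untainted.
        labels = set()
        for s in srcs:
            labels |= self.shadow.get(s, set())
        self.shadow[dst] = labels

    def sink(self, src):
        # A taint sink reports which labels reach it; an empty set means clean.
        return set(self.shadow.get(src, set()))


def run_example():
    t = TaintTracker()
    t.source("buf", "network")             # buf = recv(...)
    t.assign("length", "buf")              # length = parse(buf)
    t.assign("index", "length", "offset")  # index = length + offset
    t.assign("counter")                    # counter = 0
    return t.sink("index"), t.sink("counter")
```

In this toy run, `index` is derived from network input and so carries the `network` label, while `counter` is assigned a constant and stays clean.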

