HardRace: A Dynamic Data Race Monitor for Production Use
Abstract: Data races are critical issues in multithreaded programs, leading to unpredictable, catastrophic, and difficult-to-diagnose problems. Despite extensive in-house testing, data races often escape into deployed software and manifest in production runs. Existing approaches suffer from either prohibitively high runtime overhead or incomplete detection capability. In this paper, we introduce HardRace, a data race monitor that detects races on the fly with sufficiently low runtime overhead and high detection capability. HardRace first employs sound static analysis to determine a minimal set of essential memory accesses relevant to data races. It then leverages a hardware trace instruction, namely Intel PTWRITE, to selectively record only these memory accesses and thread synchronization events during execution with negligible runtime overhead. Given the tracing data, HardRace runs standard data race detection algorithms to promptly report potential races that occurred in production runs. The experimental evaluation shows that HardRace outperforms state-of-the-art tools such as ProRace and Kard in both runtime overhead and detection capability: HardRace can detect all kinds of data races in real-world applications while maintaining a negligible overhead of less than 2% on average.
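The abstract does not say which detection algorithm HardRace runs over its trace. As an illustration only, the following is a minimal sketch of one standard approach: happens-before detection with vector clocks. The trace format (tuples of thread id, operation, and target) and the name `detect_races` are assumptions made for this example, not HardRace's actual data layout or API. Thread fork/join ordering is omitted for brevity.

```python
def detect_races(trace):
    """Happens-before race check over a recorded trace (illustrative sketch).

    trace: list of (thread_id, op, target) tuples, a hypothetical format.
    op is "read" or "write" (target is an address) or
    "acquire" or "release" (target is a lock).
    Returns a list of (address, earlier_thread, later_thread, trace_index).
    """
    clocks = {}      # thread id -> vector clock {thread id: logical time}
    released = {}    # lock -> vector clock saved at its last release
    last_write = {}  # address -> (thread id, time) of the latest write
    last_reads = {}  # address -> {thread id: time} of reads since that write
    races = []

    def clock_of(tid):
        return clocks.setdefault(tid, {tid: 1})

    for index, (tid, op, target) in enumerate(trace):
        vc = clock_of(tid)
        if op == "acquire":
            # Join with the clock of the last release of this lock.
            for other, time in released.get(target, {}).items():
                vc[other] = max(vc.get(other, 0), time)
        elif op == "release":
            released[target] = dict(vc)
            vc[tid] += 1
        elif op in ("read", "write"):
            # A read conflicts with the last write; a write also
            # conflicts with the reads recorded since that write.
            prior = []
            if target in last_write:
                prior.append(last_write[target])
            if op == "write":
                prior.extend(last_reads.get(target, {}).items())
            for prev_tid, prev_time in prior:
                ordered = vc.get(prev_tid, 0) >= prev_time
                if prev_tid != tid and not ordered:
                    races.append((target, prev_tid, tid, index))
            if op == "write":
                last_write[target] = (tid, vc[tid])
                # Dropping older reads keeps state small but may omit
                # some later reports that involve those reads.
                last_reads[target] = {}
            else:
                last_reads.setdefault(target, {})[tid] = vc[tid]
    return races
```

In this sketch, two writes to the same address by different threads are reported unless a release and a later acquire of a common lock order them. Restricting the trace to the accesses selected by static analysis, as the abstract describes, would shrink the input to such a detector without changing its logic.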
- Kard: lightweight data race detection with per-thread memory protection. In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (Virtual, USA) (ASPLOS ’21). Association for Computing Machinery, New York, NY, USA, 647–660. https://doi.org/10.1145/3445814.3446727
- Quynh Nguyen Anh. 2014. Capstone: Next generation disassembly framework. Proceedings of the 2014 Black Hat USA, Black Hat USA 14 (2014).
- Gogul Balakrishnan and Thomas Reps. 2010. WYSINWYX: What you see is not what you eXecute. ACM Trans. Program. Lang. Syst. 32, 6, Article 23 (aug 2010), 84 pages. https://doi.org/10.1145/1749608.1749612
- Andrew R. Bernat and Barton P. Miller. 2011. Anywhere, any-time binary instrumentation. In Proceedings of the 10th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools (Szeged, Hungary) (PASTE ’11). Association for Computing Machinery, New York, NY, USA, 9–16. https://doi.org/10.1145/2024569.2024572
- RacerD: compositional static race detection. Proc. ACM Program. Lang. 2, OOPSLA, Article 144 (oct 2018), 28 pages. https://doi.org/10.1145/3276514
- PACER: proportional detection of data races. In Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation (Toronto, Ontario, Canada) (PLDI ’10). Association for Computing Machinery, New York, NY, USA, 255–268. https://doi.org/10.1145/1806596.1806626
- HerQules: securing programs via hardware-enforced message queues. In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (Virtual, USA) (ASPLOS ’21). Association for Computing Machinery, New York, NY, USA, 773–788. https://doi.org/10.1145/3445814.3446736
- SelectiveTaint: Efficient Data Flow Tracking With Static Binary Rewriting. In 30th USENIX Security Symposium (USENIX Security 21). USENIX Association, 1665–1682. https://www.usenix.org/conference/usenixsecurity21/presentation/chen-sanchuan
- Intel Corporation. [n. d.]. libipt: an Intel(R) Processor Trace decoder library. https://github.com/intel/libipt Accessed: 2024.
- Cormac Flanagan and Stephen N. Freund. 2009. FastTrack: efficient and precise dynamic race detection. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (Dublin, Ireland) (PLDI ’09). Association for Computing Machinery, New York, NY, USA, 121–133. https://doi.org/10.1145/1542476.1542490
- HAccRG: Hardware-Accelerated Data Race Detection in GPUs. In Proceedings of the 2013 42nd International Conference on Parallel Processing (ICPP ’13). IEEE Computer Society, USA, 60–69. https://doi.org/10.1109/ICPP.2013.15
- CLAP: recording local executions to reproduce concurrency failures. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (Seattle, Washington, USA) (PLDI ’13). Association for Computing Machinery, New York, NY, USA, 141–152. https://doi.org/10.1145/2491956.2462167
- Lazy Diagnosis of In-Production Concurrency Bugs. In Proceedings of the 26th Symposium on Operating Systems Principles (Shanghai, China) (SOSP ’17). Association for Computing Machinery, New York, NY, USA, 582–598. https://doi.org/10.1145/3132747.3132767
- RaceMob: crowdsourced data race detection. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles (Farmington, Pennsylvania) (SOSP ’13). Association for Computing Machinery, New York, NY, USA, 406–422. https://doi.org/10.1145/2517349.2522736
- Dynamic race prediction in linear time. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (Barcelona, Spain) (PLDI 2017). Association for Computing Machinery, New York, NY, USA, 157–170. https://doi.org/10.1145/3062341.3062374
- Leslie Lamport. 1978. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (July 1978), 558–565. https://doi.org/10.1145/359545.359563
- A Value Set Analysis Refinement Approach Based on Conditional Merging and Lazy Constraint Solving. IEEE Access 7 (2019), 114593–114606. https://doi.org/10.1109/ACCESS.2019.2936139
- When threads meet events: efficient and precise static race detection with origins. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (Virtual, Canada) (PLDI 2021). Association for Computing Machinery, New York, NY, USA, 725–739. https://doi.org/10.1145/3453483.3454073
- Learning from mistakes: a comprehensive study on real world concurrency bug characteristics. SIGOPS Oper. Syst. Rev. 42, 2 (mar 2008), 329–339. https://doi.org/10.1145/1353535.1346323
- Learning from mistakes: a comprehensive study on real world concurrency bug characteristics. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (Seattle, WA, USA) (ASPLOS XIII). Association for Computing Machinery, New York, NY, USA, 329–339. https://doi.org/10.1145/1346281.1346323
- Finding and reproducing Heisenbugs in concurrent programs. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (San Diego, California) (OSDI’08). USENIX Association, USA, 267–280.
- Robert O’Callahan and Jong-Deok Choi. 2003. Hybrid dynamic data race detection. In Proceedings of the Ninth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (San Diego, California, USA) (PPoPP ’03). Association for Computing Machinery, New York, NY, USA, 167–178. https://doi.org/10.1145/781498.781528
- Eraser: a dynamic data race detector for multithreaded programs. ACM Trans. Comput. Syst. 15, 4 (nov 1997), 391–411. https://doi.org/10.1145/265924.265927
- Konstantin Serebryany and Timur Iskhodzhanov. 2009. ThreadSanitizer: data race detection in practice. In Proceedings of the Workshop on Binary Instrumentation and Applications (New York, New York, USA) (WBIA ’09). Association for Computing Machinery, New York, NY, USA, 62–71. https://doi.org/10.1145/1791194.1791203
- RACEZ: a lightweight and non-invasive race detection tool for production applications. In 2011 33rd International Conference on Software Engineering (ICSE). 401–410. https://doi.org/10.1145/1985793.1985848
- SOK: (State of) The Art of War: Offensive Techniques in Binary Analysis. In 2016 IEEE Symposium on Security and Privacy (SP). 138–157. https://doi.org/10.1109/SP.2016.17
- Sound predictive race detection in polynomial time. In Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (Philadelphia, PA, USA) (POPL ’12). Association for Computing Machinery, New York, NY, USA, 387–400. https://doi.org/10.1145/2103656.2103702
- RELAY: static race detection on millions of lines of code. In Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (Dubrovnik, Croatia) (ESEC-FSE ’07). Association for Computing Machinery, New York, NY, USA, 205–214. https://doi.org/10.1145/1287624.1287654
- Jie Yu and Satish Narayanasamy. 2009. A case for an interleaving constrained shared-memory multi-processor. In Proceedings of the 36th Annual International Symposium on Computer Architecture (Austin, TX, USA) (ISCA ’09). Association for Computing Machinery, New York, NY, USA, 325–336. https://doi.org/10.1145/1555754.1555796
- ProRace: Practical Data Race Detection for Production Use. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems (Xi’an, China) (ASPLOS ’17). Association for Computing Machinery, New York, NY, USA, 149–162. https://doi.org/10.1145/3037697.3037708
- TxRace: Efficient Data Race Detection Using Commodity Hardware Transactional Memory. In Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems (Atlanta, Georgia, USA) (ASPLOS ’16). Association for Computing Machinery, New York, NY, USA, 159–173. https://doi.org/10.1145/2872362.2872384
- JPortal: precise and efficient control-flow tracing for JVM programs with Intel processor trace. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (Virtual, Canada) (PLDI 2021). Association for Computing Machinery, New York, NY, USA, 1080–1094. https://doi.org/10.1145/3453483.3454096