FineWAVE: Fine-Grained Warning Verification of Bugs for Automated Static Analysis Tools (2403.16032v2)

Published 24 Mar 2024 in cs.SE

Abstract: Automated Static Analysis Tools (ASATs) have evolved over time to assist in detecting bugs. However, excessive false warnings can impede developers' productivity and confidence in the tools. Previous research efforts have explored learning-based methods to validate the reported warnings. Nevertheless, their coarse granularity, focusing on either long-term warnings or function-level alerts, makes them insensitive to individual bugs. Moreover, they rely on manually crafted features or solely on source code semantics, which is inadequate for effective learning. In this paper, we propose FineWAVE, a learning-based approach that verifies bug-sensitive warnings at a fine-grained granularity. Specifically, we design a novel LSTM-based model that captures the multi-modal semantics of source code and warnings from ASATs and highlights their correlations with cross-attention. To tackle the scarcity of data for training and evaluation, we collected a large-scale dataset of 280,273 warnings. We conducted extensive experiments on this dataset to evaluate FineWAVE. The results demonstrate the effectiveness of our approach, with an F1-score of 97.79% for reducing false alarms and 67.06% for confirming actual warnings, significantly outperforming all baselines. Moreover, we applied FineWAVE to filter out about 92% of the warnings in four popular real-world projects and found 25 new bugs with minimal manual effort.
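To make the described architecture concrete, below is a minimal PyTorch sketch of the kind of model the abstract outlines: two bidirectional LSTM encoders (one for code tokens, one for warning tokens) fused by cross-attention, followed by a binary head that labels a warning as a false alarm or an actual bug. All layer names, dimensions, and the tokenization scheme here are assumptions for illustration, not the authors' implementation.

```python
# Sketch of a two-encoder warning-verification model (assumed design,
# not the paper's actual code): LSTM encoders per modality, fused with
# cross-attention, classified into {false alarm, actual bug}.
import torch
import torch.nn as nn

class WarningVerifier(nn.Module):
    def __init__(self, vocab_size=30000, dim=256, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        # Bidirectional LSTMs encode each modality separately;
        # hidden size dim//2 per direction keeps the output at dim.
        self.code_lstm = nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True)
        self.warn_lstm = nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True)
        # Cross-attention: warning states query the code states, so the
        # model can highlight which code regions a warning correlates with.
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(dim, 2)  # logits: {false alarm, actual bug}

    def forward(self, code_ids, warn_ids):
        code_h, _ = self.code_lstm(self.embed(code_ids))  # (B, Lc, dim)
        warn_h, _ = self.warn_lstm(self.embed(warn_ids))  # (B, Lw, dim)
        fused, _ = self.cross_attn(query=warn_h, key=code_h, value=code_h)
        return self.classifier(fused.mean(dim=1))         # (B, 2)

# Usage with dummy token ids: a batch of 8 code snippets and warnings.
model = WarningVerifier()
logits = model(torch.randint(0, 30000, (8, 128)), torch.randint(0, 30000, (8, 32)))
```

Using the warning as the attention query reflects the fine-grained, bug-sensitive framing: each individual warning is verified against the specific code it flags, rather than scoring a whole function or a long-lived alert history.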

Authors (10)
  1. Han Liu (340 papers)
  2. Jian Zhang (542 papers)
  3. Cen Zhang (69 papers)
  4. Xiaohan Zhang (78 papers)
  5. Kaixuan Li (10 papers)
  6. Sen Chen (49 papers)
  7. Shang-Wei Lin (18 papers)
  8. Yixiang Chen (19 papers)
  9. Xinhua Li (5 papers)
  10. Yang Liu (2253 papers)