
Observation-based unit test generation at Meta (2402.06111v1)

Published 9 Feb 2024 in cs.SE

Abstract: TestGen automatically generates unit tests, carved from serialized observations of complex objects, observed during app execution. We describe the development and deployment of TestGen at Meta. In particular, we focus on the scalability challenges overcome during development in order to deploy observation-based test carving at scale in industry. So far, TestGen has landed 518 tests into production, which have been executed 9,617,349 times in continuous integration, finding 5,702 faults. Meta is currently in the process of more widespread deployment. Our evaluation reveals that, when carving its observations from 4,361 reliable end-to-end tests, TestGen was able to generate tests for at least 86% of the classes covered by end-to-end tests. Testing on 16 Kotlin Instagram app-launch-blocking tasks demonstrated that the TestGen tests would have trapped 13 of these before they became launch blocking.
