GPT-4 and Safety Case Generation: An Exploratory Analysis (2312.05696v1)
Abstract: In the ever-evolving landscape of software engineering, the emergence of large language models (LLMs) and conversational interfaces, exemplified by ChatGPT, is nothing short of revolutionary. While their potential is undeniable across various domains, this paper investigates a largely uncharted application: the generation of safety cases. Our primary objective is to examine the existing knowledge base of GPT-4, focusing specifically on its understanding of the Goal Structuring Notation (GSN), a well-established notation for visually representing safety cases. Subsequently, we perform four distinct experiments with GPT-4. These experiments are designed to assess its capacity for generating safety cases within a defined system and application domain. To measure the performance of GPT-4 in this context, we compare the results it generates with ground-truth safety cases created for an X-ray system and a Machine-Learning (ML)-enabled component for tire noise recognition (TNR) in a vehicle. This allows us to gain valuable insights into the model's generative capabilities. Our findings indicate that GPT-4 can produce safety arguments that are moderately accurate and reasonable. Furthermore, it can generate safety cases that closely align with the semantic content of the reference safety cases used as ground truths in our experiments.