S3LLM: Large-Scale Scientific Software Understanding with LLMs using Source, Metadata, and Document (2403.10588v1)

Published 15 Mar 2024 in cs.SE and cs.AI

Abstract: The understanding of large-scale scientific software poses significant challenges due to its diverse codebase, extensive code length, and target computing architectures. The emergence of generative AI, specifically LLMs, provides novel pathways for understanding such complex scientific codes. This paper presents S3LLM, an LLM-based framework designed to enable the examination of source code, code metadata, and summarized information in conjunction with textual technical reports in an interactive, conversational manner through a user-friendly interface. S3LLM leverages open-source LLaMA-2 models to enhance code analysis through the automatic transformation of natural language queries into domain-specific language (DSL) queries. Specifically, it translates these queries into Feature Query Language (FQL), enabling efficient scanning and parsing of entire code repositories. In addition, S3LLM is equipped to handle diverse metadata types, including DOT, SQL, and customized formats. Furthermore, S3LLM incorporates retrieval augmented generation (RAG) and LangChain technologies to directly query extensive documents. S3LLM demonstrates the potential of using locally deployed open-source LLMs for the rapid understanding of large-scale scientific computing software, eliminating the need for extensive coding expertise, and thereby making the process more efficient and effective. S3LLM is available at https://github.com/ResponsibleAILab/s3LLM.

