CodeScholar: Growing Idiomatic Code Examples (2312.15157v1)
Abstract: Programmers often search for usage examples for API methods. A tool that could generate realistic, idiomatic, and contextual usage examples for one or more APIs would be immensely beneficial to developers. Such a tool would relieve the need for a deep understanding of the API landscape, augment existing documentation, and help discover interactions among APIs. We present CodeScholar, a tool that generates idiomatic code examples demonstrating the common usage of API methods. It includes a novel neural-guided search technique over graphs that grows the query APIs into idiomatic code examples. Our user study demonstrates that in 70% of cases, developers prefer CodeScholar-generated examples over those from state-of-the-art large language models (LLMs) like GPT3.5. We quantitatively evaluate 60 single and 25 multi-API queries from 6 popular Python libraries and show that, across the board, CodeScholar generates more realistic, diverse, and concise examples. In addition, we show that CodeScholar helps not only developers but also LLM-powered programming assistants generate correct code in a program synthesis setting.
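To make the idea of "growing" a query API into a larger example concrete, the following is a minimal toy sketch, not CodeScholar's actual algorithm. It assumes a small hand-written graph of code elements and replaces the neural guidance with a plain frequency count over a hypothetical corpus. The names `grow`, `support`, `graph`, `corpus`, `beam_width`, and `min_support` are all illustrative assumptions and do not come from the paper.

```python
from typing import Callable, Dict, FrozenSet, List, Set

Graph = Dict[str, Set[str]]  # node label -> labels of adjacent nodes


def grow(
    graph: Graph,
    seeds: Set[str],
    score: Callable[[FrozenSet[str]], float],
    beam_width: int = 2,
    min_support: float = 2,
    max_size: int = 5,
) -> FrozenSet[str]:
    """Beam search that expands a seed node set one neighbor at a time.

    `score` is a stand-in for a learned model (an assumption of this sketch).
    Growth stops when no expansion scores at least `min_support`.
    """
    beam: List[FrozenSet[str]] = [frozenset(seeds)]
    while True:
        candidates: Set[FrozenSet[str]] = set()
        for sub in beam:
            if len(sub) >= max_size:
                continue
            # Frontier: nodes adjacent to the current subgraph but not in it.
            frontier = set().union(*(graph.get(n, set()) for n in sub)) - sub
            for node in frontier:
                grown = sub | {node}
                if score(grown) >= min_support:
                    candidates.add(grown)
        if not candidates:
            return beam[0]  # best subgraph from the last successful step
        # Keep the highest-scoring candidates; break ties deterministically.
        ranked = sorted(candidates, key=lambda s: (-score(s), sorted(s)))
        beam = ranked[:beam_width]


# Hypothetical toy data: adjacency between code elements and a tiny corpus.
graph: Graph = {
    "np.mean": {"np.array", "print"},
    "np.array": {"np.mean", "range"},
    "print": {"np.mean"},
    "range": {"np.array"},
}
corpus = [
    {"np.array", "np.mean"},
    {"np.array", "np.mean", "print"},
    {"np.array", "np.mean", "range"},
    {"np.mean", "print"},
]


def support(sub: FrozenSet[str]) -> int:
    """Number of corpus snippets that contain every element of `sub`."""
    return sum(sub <= snippet for snippet in corpus)


result = grow(graph, {"np.mean"}, support)
print(sorted(result))
```

In this toy setup the search starts from `np.mean`, adds the neighbor that co-occurs with it most often, and stops once every further expansion falls below `min_support`. A real system would operate on program graphs and use a trained model in place of `support`.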
Authors: Manish Shetty, Koushik Sen, Ion Stoica