
Combining Program Analysis and Statistical Language Model for Code Statement Completion (1911.07781v1)

Published 18 Nov 2019 in cs.SE

Abstract: Automatic code completion helps improve developers' productivity in their programming tasks. A program contains instructions expressed via code statements, which are considered the basic units of program execution. In this paper, we introduce AutoSC, which combines program analysis and the principle of software naturalness to fill in a partially completed statement. AutoSC benefits from the strengths of both directions, so that the completed code statement is both frequent and valid. AutoSC is first trained on a large code corpus to derive the templates of candidate statements. Then, it uses program analysis to validate and concretize the templates into syntactically and type-valid candidate statements. Finally, these candidates are ranked using a language model trained on the lexical form of the source code in the code corpus. Our empirical evaluation on large datasets of real-world projects shows that AutoSC achieves 38.9-41.3% top-1 accuracy and 48.2-50.1% top-5 accuracy in statement completion. It also outperforms a state-of-the-art approach by 9X-69X in top-1 accuracy.
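
The three stages named in the abstract (derive templates, concretize them into type-valid candidates, rank with a language model) can be illustrated with a toy sketch. This is not AutoSC's implementation: the template format, the `TEMPLATES` list, the `concretize`, `train_bigram`, `log_prob`, and `complete` helpers, and the add-one-smoothed bigram model are all assumptions made here for illustration, and the templates are hand-written rather than mined from a corpus.

```python
import itertools
import math
from collections import Counter

# Hypothetical templates (hand-written here; the paper derives them from a corpus).
# A tuple ("VAR", type) marks a hole to be filled with an in-scope variable.
TEMPLATES = [
    [("VAR", "int"), "=", ("VAR", "int"), "+", "1", ";"],
    [("VAR", "str"), "=", ("VAR", "str"), ";"],
]

def concretize(template, scope):
    """Fill each typed hole with every in-scope variable of the matching type."""
    options = []
    for tok in template:
        if isinstance(tok, tuple):
            options.append([name for name, typ in scope.items() if typ == tok[1]])
        else:
            options.append([tok])
    return [list(combo) for combo in itertools.product(*options)]

def train_bigram(corpus):
    """Count unigrams and bigrams over tokenized statements (a stand-in model)."""
    unigrams, bigrams = Counter(), Counter()
    for stmt in corpus:
        toks = ["<s>"] + stmt
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    return unigrams, bigrams

def log_prob(stmt, model, vocab_size):
    """Add-one-smoothed bigram log-probability of a tokenized statement."""
    unigrams, bigrams = model
    toks = ["<s>"] + stmt
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(toks, toks[1:])
    )

def complete(prefix, scope, model, vocab_size, k=5):
    """Return up to k candidates that extend the prefix, most probable first."""
    candidates = [
        stmt
        for template in TEMPLATES
        for stmt in concretize(template, scope)
        if stmt[:len(prefix)] == prefix
    ]
    candidates.sort(key=lambda s: log_prob(s, model, vocab_size), reverse=True)
    return candidates[:k]

# Toy usage: variables in scope and a two-statement training corpus.
scope = {"i": "int", "n": "int", "name": "str"}
corpus = [["i", "=", "i", "+", "1", ";"], ["n", "=", "i", "+", "1", ";"]]
model = train_bigram(corpus)
vocab_size = len({tok for stmt in corpus for tok in stmt} | {"<s>"})
ranked = complete(["i", "="], scope, model, vocab_size)
```

In this sketch the typed holes play the role of the validity check, since only variables of the required type are substituted, while the bigram score stands in for the "frequent" side of the combination the abstract describes.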
