
CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context (2212.10007v2)

Published 20 Dec 2022 in cs.CL and cs.SE

Abstract: While pre-trained language models (LMs) for code have achieved great success in code completion, they generate code conditioned only on the contents of the current file, i.e., the in-file context, and ignore the rich semantics in other files within the same project, i.e., the cross-file context, a critical source of information that is especially useful in modern modular software development. This omission limits the capacity of code LMs in code completion and leads to unexpected behaviors such as generating hallucinated class member functions or function calls with unexpected arguments. In this work, we develop a cross-file context finder tool, CCFINDER, that effectively locates and retrieves the most relevant cross-file context. We propose CoCoMIC, a framework that incorporates cross-file context to learn the in-file and cross-file context jointly on top of pretrained code LMs. When the cross-file context is provided, CoCoMIC improves the existing code LM with a 33.94% relative increase in exact match and a 28.69% relative increase in identifier matching for code completion.

Authors (8)
  1. Yangruibo Ding (17 papers)
  2. Zijian Wang (99 papers)
  3. Wasi Uddin Ahmad (41 papers)
  4. Murali Krishna Ramanathan (13 papers)
  5. Ramesh Nallapati (38 papers)
  6. Parminder Bhatia (50 papers)
  7. Dan Roth (222 papers)
  8. Bing Xiang (74 papers)
Citations (55)