
Context-aware length of autocomplete code suggestions

Determine the appropriate, context-aware generation length for code suggestions produced by autocomplete-based large language models in integrated development environments, so that the suggestions are complete and useful without being excessively long or off-topic relative to the programmer’s current code and task intent.


Background

Within RealHumanEval, autocomplete-based assistance often produced suggestions that were either fragmentary (e.g., incomplete variable definitions or function implementations) or overly long and diverging from the task, which disrupted users’ workflows. These problems arise because LLMs typically require a maximum generation length to terminate suggestions and do not inherently know when to stop.

To probe suggestion length effects, the authors randomized the maximum suggestion length for autocomplete between 10 and 120 tokens (mean 64), but still observed issues with incomplete or overly verbose suggestions. As a result, identifying a principled, context-aware policy for suggestion length that balances completeness and concision remains unresolved.
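One simple direction a context-aware policy could take is post-hoc trimming: sample a maximum token budget (as in the randomized setup described above), then cut the raw suggestion back to the last line at which all brackets are balanced, so dangling fragments such as an unclosed call are dropped. The sketch below is purely illustrative and is not the paper's method; the function names and the bracket-balance heuristic are assumptions.

```python
import random


def sample_max_tokens(rng: random.Random, low: int = 10, high: int = 120) -> int:
    """Sample a maximum suggestion length uniformly, mirroring the
    randomized 10-120 token budget described above (hypothetical helper)."""
    return rng.randint(low, high)


def trim_to_balanced_prefix(suggestion: str) -> str:
    """Illustrative stopping heuristic: keep only the longest prefix of
    whole lines after which (), [], and {} are all balanced.

    This drops trailing fragments like an unclosed function call, at the
    cost of sometimes discarding useful partial code."""
    pairs = {"(": ")", "[": "]", "{": "}"}
    opens, closes = set(pairs), set(pairs.values())
    depth = 0
    best_end = 0  # number of lines in the last balanced prefix
    lines = suggestion.splitlines()
    for i, line in enumerate(lines):
        for ch in line:
            if ch in opens:
                depth += 1
            elif ch in closes:
                depth -= 1
        if depth == 0:
            best_end = i + 1
    return "\n".join(lines[:best_end])


# Example: the unfinished call `y = f(z` on the last line is trimmed away.
raw = "x = compute(\n    a, b)\ny = f(z"
print(trim_to_balanced_prefix(raw))
```

A real policy would need more than bracket counting (e.g., parsing for statement completeness, or training the model to emit stop tokens), but even this toy version shows how a stopping decision can depend on the suggestion's own structure rather than a fixed token budget.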

References

It is an open question to determine the appropriate length for how much code to generate in a context-aware manner.

The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers (Mozannar et al., arXiv:2404.02806, 3 Apr 2024), Section: Design Opportunities, Autocomplete-specific suggestions.