
Assessing the Answerability of Queries in Retrieval-Augmented Code Generation (2411.05547v2)

Published 8 Nov 2024 in cs.CL

Abstract: Thanks to the unprecedented language understanding and generation capabilities of LLMs, Retrieval-augmented Code Generation (RaCG) has recently been widely adopted by software developers. While this has increased productivity, incorrect code is still frequently produced. In particular, plausible yet incorrect code is often generated for user queries that cannot be answered with the given query and the retrieved API descriptions. This study proposes an answerability evaluation task, which assesses whether a valid answer can be generated from a user's query and the retrieved APIs in RaCG. Additionally, we build a benchmark dataset called Retrieval-augmented Code Generability Evaluation (RaCGEval) to evaluate the performance of models on this task. Experimental results show that the task remains very challenging, with baseline models achieving a low performance of 46.7%. Furthermore, this study discusses methods that could significantly improve performance.
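To make the task concrete: answerability assessment takes a user query plus the retrieved API descriptions and decides whether a valid answer can be generated before any code is produced. The paper evaluates model-based judges for this; the snippet below is only an illustrative stand-in that uses a toy keyword-coverage heuristic (the `ApiDoc` type, `is_answerable` function, and threshold are all hypothetical, not from the paper).

```python
from dataclasses import dataclass

@dataclass
class ApiDoc:
    """Hypothetical container for one retrieved API description."""
    name: str
    description: str

def is_answerable(query: str, retrieved: list[ApiDoc], threshold: float = 0.5) -> bool:
    """Toy proxy for the answerability task: label a query 'answerable'
    when enough of its content words appear in the retrieved API docs.
    A real RaCG pipeline would use an LLM-based judge instead."""
    stopwords = {"a", "an", "the", "to", "of", "in", "how", "do", "i"}
    terms = {w for w in query.lower().split() if w not in stopwords}
    if not terms:
        return False
    corpus = " ".join(d.name.lower() + " " + d.description.lower() for d in retrieved)
    covered = sum(1 for t in terms if t in corpus)
    return covered / len(terms) >= threshold

docs = [ApiDoc("sort_list", "sorts a list of integers in ascending order")]
print(is_answerable("how do i sort a list of integers", docs))  # query covered by the API docs
print(is_answerable("how do i parse json from a url", docs))    # query not covered
```

The point of the gate is the second case: rather than letting the generator produce plausible but wrong code for an uncovered query, the pipeline can abstain or ask for clarification.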

Authors (5)
  1. Geonmin Kim (10 papers)
  2. Jaeyeon Kim (42 papers)
  3. Hancheol Park (5 papers)
  4. Wooksu Shin (2 papers)
  5. Tae-Ho Kim (15 papers)
