
AugmentedCode: Examining the Effects of Natural Language Resources in Code Retrieval Models (2110.08512v1)

Published 16 Oct 2021 in cs.SE and cs.AI

Abstract: Code retrieval allows software engineers to search for code through a natural language query, and it relies on both natural language processing and software engineering techniques. There have been several attempts at code retrieval, ranging from searching code snippets to searching functions. In this paper, we introduce Augmented Code (AugmentedCode) retrieval, which takes advantage of existing information within the code and constructs an augmented programming language to improve the performance of code retrieval models. We curated a large corpus of Python and showcase the framework and the results of the augmented programming language, which outperforms CodeSearchNet and CodeBERT with a Mean Reciprocal Rank (MRR) of 0.73 and 0.96, respectively. The best-performing fine-tuned augmented code retrieval model is published on HuggingFace at https://huggingface.co/Fujitsu/AugCode and a demonstration video is available at: https://youtu.be/mnZrUTANjGs .
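
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how Mean Reciprocal Rank is commonly computed for a retrieval task. It is not taken from the paper: the function name `mean_reciprocal_rank` and the sample ranks are illustrative assumptions, and the paper's exact evaluation protocol may differ.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all queries.

    Each entry is the 1-based position of the correct code snippet in the
    ranked results for one query, or None if it was not retrieved at all
    (which contributes a reciprocal rank of 0).
    """
    if not ranks:
        raise ValueError("at least one query is required")
    total = 0.0
    for rank in ranks:
        if rank is None:
            continue  # a miss adds nothing to the sum
        if rank < 1:
            raise ValueError("ranks are 1-based")
        total += 1.0 / rank
    return total / len(ranks)


# Hypothetical ranks for four queries: hits at positions 1, 2 and 4, plus one miss.
example_ranks = [1, 2, 4, None]
print(mean_reciprocal_rank(example_ranks))  # (1 + 0.5 + 0.25 + 0) / 4 = 0.4375
```

Under this definition, an MRR close to 1 means the correct snippet is usually ranked first, so higher values indicate better retrieval.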

Authors (7)
  1. Mehdi Bahrami (7 papers)
  2. N. C. Shrikanth (6 papers)
  3. Yuji Mizobuchi (2 papers)
  4. Lei Liu (332 papers)
  5. Masahiro Fukuyori (2 papers)
  6. Wei-Peng Chen (6 papers)
  7. Kazuki Munakata (5 papers)
Citations (2)
