InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct (2407.05700v2)
Abstract: Recent advancements in open-source code LLMs have been driven by fine-tuning on data generated by powerful closed-source LLMs, which is expensive to obtain. This paper explores whether a fine-tuned open-source model can generate additional data to augment its own instruction-tuning dataset. We make two observations: (1) a code snippet can serve as the response to different instructions, and (2) instruction-tuned code LLMs are better at translating code into instructions than the reverse. Based on these observations, we propose Inverse-Instruct, a data augmentation technique in which a fine-tuned LLM generates additional instructions for the code responses in its own training dataset. The new instruction-response pairs are added to the original dataset, and fine-tuning on the augmented dataset yields a stronger code LLM. We empirically validate Inverse-Instruct on a range of open-source code models (e.g., CodeLlama-Python and DeepSeek-Coder) and benchmarks (HumanEval(+), MBPP(+), DS-1000, and MultiPL-E), showing that it consistently improves the base models.
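At its core, Inverse-Instruct is a single code-to-instruction pass over the existing training data, followed by another round of fine-tuning on the enlarged dataset. The sketch below illustrates that loop in Python as a minimal, hedged example: the `generate` helper, the prompt wording, and the data format are illustrative assumptions, not the authors' implementation, which also filters low-quality generations.

```python
# Minimal sketch of the Inverse-Instruct augmentation loop described in the
# abstract. `generate` is a hypothetical helper that queries the fine-tuned
# code LLM; prompt text and data format are illustrative assumptions.

from typing import Callable, Dict, List

Pair = Dict[str, str]  # {"instruction": ..., "response": ...}


def inverse_instruct(
    seed_pairs: List[Pair],
    generate: Callable[[str], str],
) -> List[Pair]:
    """Augment an instruction-tuning dataset by translating code responses
    back into new instructions with the fine-tuned model itself."""
    augmented: List[Pair] = list(seed_pairs)
    for pair in seed_pairs:
        code = pair["response"]
        # Ask the model for a natural-language instruction that the code answers
        # (code-to-instruction is the direction the model handles well).
        prompt = (
            "Write a concise programming instruction that the following "
            f"code correctly answers:\n\n{code}\n\nInstruction:"
        )
        new_instruction = generate(prompt).strip()
        # Pair the generated instruction with the original code response.
        # The paper additionally filters low-quality generations at this step.
        if new_instruction:
            augmented.append({"instruction": new_instruction, "response": code})
    return augmented


# The augmented dataset is then used to fine-tune the base model again,
# producing the stronger InverseCoder model.
```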
- Program synthesis with large language models. arXiv preprint arXiv:2108.07732, 2021.
- Qwen technical report. arXiv preprint arXiv:2309.16609, 2023.
- A framework for the evaluation of code generation models. https://github.com/bigcode-project/bigcode-evaluation-harness, 2022.
- Improving image generation with better captions, 2023. URL https://api.semanticscholar.org/CorpusID:264403242.
- Multipl-e: A scalable and polyglot approach to benchmarking neural code generation. IEEE Transactions on Software Engineering, 49:3675–3691, 2023. URL https://api.semanticscholar.org/CorpusID:258205341.
- Sahil Chaudhary. Code alpaca: An instruction-following llama model for code generation. https://github.com/sahil280114/codealpaca, 2023.
- Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374, 2021.
- Self-play fine-tuning converts weak language models to strong language models, 2024.
- Palm: Scaling language modeling with pathways. J. Mach. Learn. Res., 24:240:1–240:113, 2022. URL https://api.semanticscholar.org/CorpusID:247951931.
- Pangu-coder: Program synthesis with function-level language modeling. ArXiv, abs/2207.11280, 2022. URL https://api.semanticscholar.org/CorpusID:251040785.
- Scaling instruction-finetuned language models. Journal of Machine Learning Research, 25(70):1–53, 2024.
- Stepcoder: Improve code generation with reinforcement learning from compiler feedback, 2024.
- Incoder: A generative model for code infilling and synthesis. ArXiv, abs/2204.05999, 2022. URL https://api.semanticscholar.org/CorpusID:248157108.
- Learning instructions with unlabeled data for zero-shot cross-task generalization. In Conference on Empirical Methods in Natural Language Processing, 2022. URL https://api.semanticscholar.org/CorpusID:252918165.
- Deepseek-coder: When the large language model meets programming - the rise of code intelligence. ArXiv, abs/2401.14196, 2024. URL https://api.semanticscholar.org/CorpusID:267211867.
- Visual programming: Compositional visual reasoning without training. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14953–14962, 2023.
- Language models can teach themselves to program better. ArXiv, abs/2207.14502, 2022. URL https://api.semanticscholar.org/CorpusID:251197051.
- Efficient memory management for large language model serving with pagedattention. Proceedings of the 29th Symposium on Operating Systems Principles, 2023. URL https://api.semanticscholar.org/CorpusID:261697361.
- Ds-1000: A natural and reliable benchmark for data science code generation. In International Conference on Machine Learning, pages 18319–18345. PMLR, 2023.
- Coderl: Mastering code generation through pretrained models and deep reinforcement learning. In NeurIPS, 2022.
- Starcoder: may the source be with you! ArXiv, abs/2305.06161, 2023a. URL https://api.semanticscholar.org/CorpusID:258588247.
- Self-alignment with instruction backtranslation. ArXiv, abs/2308.06259, 2023b. URL https://api.semanticscholar.org/CorpusID:260866107.
- Competition-level code generation with alphacode. Science, 378:1092–1097, 2022. URL https://api.semanticscholar.org/CorpusID:246527904.
- Code as policies: Language model programs for embodied control. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 9493–9500. IEEE, 2023.
- Rltf: Reinforcement learning from unit test feedback. ArXiv, abs/2307.04349, 2023a. URL https://api.semanticscholar.org/CorpusID:259501019.
- Is your code generated by chatgpt really correct? rigorous evaluation of large language models for code generation. ArXiv, abs/2305.01210, 2023b. URL https://api.semanticscholar.org/CorpusID:258437095.
- Starcoder 2 and the stack v2: The next generation. arXiv preprint arXiv:2402.19173, 2024.
- Wizardcoder: Empowering code large language models with evol-instruct. arXiv preprint arXiv:2306.08568, 2023.
- Eureka: Human-level reward design via coding large language models. arXiv preprint arXiv:2310.12931, 2023.
- Microsoft. Github copilot – your ai pair programmer. https://github.com/features/copilot, 2023.
- Octopack: Instruction tuning code large language models. ArXiv, abs/2308.07124, 2023. URL https://api.semanticscholar.org/CorpusID:260886874.
- Llms for science: Usage for code generation and data analysis. arXiv preprint arXiv:2311.16733, 2023.
- Codegen: An open large language model for code with multi-turn program synthesis. arXiv preprint arXiv:2203.13474, 2022.
- OpenAI. Chatgpt: Optimizing language models for dialogue, 2022.
- OpenAI. Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
- Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of machine learning research, 21(140):1–67, 2020.
- Code llama: Open foundation models for code. ArXiv, abs/2308.12950, 2023. URL https://api.semanticscholar.org/CorpusID:261100919.
- Adafactor: Adaptive learning rates with sublinear memory cost. ArXiv, abs/1804.04235, 2018. URL https://api.semanticscholar.org/CorpusID:4786918.
- Pangu-coder2: Boosting large language models for code with ranking feedback, 2023.
- Execution-based code generation using deep reinforcement learning. ArXiv, abs/2301.13816, 2023. URL https://api.semanticscholar.org/CorpusID:256416258.
- Learning performance-improving code edits. In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=ix7rLVHXyY.
- One embedder, any task: Instruction-finetuned text embeddings. ArXiv, abs/2212.09741, 2022. URL https://api.semanticscholar.org/CorpusID:254853816.
- Principle-driven self-alignment of language models from scratch with minimal human supervision. ArXiv, abs/2305.03047, 2023. URL https://api.semanticscholar.org/CorpusID:258479665.
- Vipergpt: Visual inference via python execution for reasoning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 11888–11898, 2023.
- Worldcoder, a model-based llm agent: Building world models by writing code and interacting with the environment. arXiv preprint arXiv:2402.12275, 2024.
- Stanford alpaca: An instruction-following llama model. https://github.com/tatsu-lab/stanford_alpaca, 2023.
- Gemini: a family of highly capable multimodal models. arXiv preprint arXiv:2312.11805, 2023.
- theblackcat102. The evolved code alpaca dataset. https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1, 2023.
- Voyager: An open-ended embodied agent with large language models. arXiv preprint arXiv:2305.16291, 2023a.
- A survey on data selection for llm instruction tuning. ArXiv, abs/2402.05123, 2024. URL https://api.semanticscholar.org/CorpusID:267547917.
- Self-instruct: Aligning language models with self-generated instructions. arXiv preprint arXiv:2212.10560, 2022.
- Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. In EMNLP, 2021.
- Codet5+: Open code large language models for code understanding and generation. In Conference on Empirical Methods in Natural Language Processing, 2023b. URL https://api.semanticscholar.org/CorpusID:258685677.
- Finetuned language models are zero-shot learners. arXiv preprint arXiv:2109.01652, 2021.
- Magicoder: Source code is all you need. arXiv preprint arXiv:2312.02120, 2023.
- Starcoder2-instruct: Fully transparent and permissive self-alignment for code generation, 2024. URL https://github.com/bigcode-project/starcoder2-self-align.
- Wizardlm: Empowering large language models to follow complex instructions. arXiv preprint arXiv:2304.12244, 2023.
- Dynosaur: A dynamic growth paradigm for instruction-tuning data curation. In Conference on Empirical Methods in Natural Language Processing, 2023. URL https://api.semanticscholar.org/CorpusID:258841263.
- Wavecoder: Widespread and versatile enhanced instruction tuning with refined data generation. arXiv preprint arXiv:2312.14187, 2023.
- Self-rewarding language models, 2024.
- Automathtext: Autonomous data selection with language models for mathematical texts. arXiv preprint arXiv:2402.07625, 2024.
- Codegeex: A pre-trained model for code generation with multilingual evaluations on humaneval-x. ArXiv, abs/2303.17568, 2023. URL https://api.semanticscholar.org/CorpusID:257834177.
- Opencodeinterpreter: Integrating code generation with execution and refinement. arXiv preprint arXiv:2402.14658, 2024.
Authors:
- Yutong Wu
- Di Huang
- Wenxuan Shi
- Wei Wang
- Lingzhe Gao
- Shihao Liu
- Ziyuan Nan
- Kaizhao Yuan
- Rui Zhang
- Xishan Zhang
- Zidong Du
- Qi Guo
- Yewen Pu
- Dawei Yin
- Xing Hu
- Yunji Chen