OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection (2407.16237v2)
Abstract: Recent studies have demonstrated the significant potential of LLMs in generating Register Transfer Level (RTL) code, with notable advancements showcased by commercial models such as GPT-4 and Claude3-Opus. However, these proprietary LLMs often raise concerns regarding privacy and security. While open-source LLMs offer solutions to these concerns, they typically underperform commercial models in RTL code generation tasks, primarily due to the scarcity of high-quality open-source RTL datasets. To address this challenge, we introduce OriGen, a fully open-source framework that incorporates self-reflection capabilities and a novel dataset augmentation methodology for generating high-quality, large-scale RTL code. Our approach employs a code-to-code augmentation technique to enhance the quality of open-source RTL code datasets. Furthermore, OriGen can rectify syntactic errors through a self-reflection process that leverages compiler feedback. Experimental results demonstrate that OriGen significantly outperforms other open-source alternatives in RTL code generation. It surpasses the previous best-performing open-source LLM by 12.8% and even exceeds GPT-4 Turbo in the pass@1 metric on the VerilogEval-Human benchmark. Moreover, OriGen exhibits superior capabilities in self-reflection and error correction, outperforming GPT-4 by 19.9% on a benchmark designed to evaluate self-reflection capabilities.
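The self-reflection process described above — generate RTL, compile it, and feed compiler diagnostics back to the model for repair — can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not OriGen's actual implementation: `generate` and `compile_check` are hypothetical stand-ins for the LLM call and a Verilog compiler front end (e.g. Icarus Verilog).

```python
def self_reflection_loop(prompt, generate, compile_check, max_rounds=3):
    """Iteratively repair generated RTL using compiler feedback.

    `generate(prompt) -> str` stands in for an LLM call;
    `compile_check(code) -> (ok, errors)` stands in for a syntax check
    such as running a Verilog compiler and capturing its diagnostics.
    Both names are illustrative, not part of any real API.
    """
    code = generate(prompt)
    for _ in range(max_rounds):
        ok, errors = compile_check(code)
        if ok:
            return code  # Compiles cleanly; stop reflecting.
        # Feed the compiler diagnostics back to the model for repair.
        code = generate(
            f"{prompt}\n\nPrevious attempt:\n{code}\n"
            f"Compiler errors:\n{errors}\nFix the errors."
        )
    return code  # Best effort after max_rounds repair attempts.
```

In practice `compile_check` could wrap a real compiler invocation (e.g. via `subprocess`) and return its stderr as the error text; the loop terminates as soon as the code compiles, so well-formed first attempts incur no extra model calls.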
- GPT-4 Technical Report. arXiv preprint arXiv:2303.08774 (2023).
- Anthropic. 2024. Introducing the next generation of Claude. https://www.anthropic.com/news/claude-3-family
- Qwen Technical Report. arXiv preprint arXiv:2309.16609 (2023).
- Chip-chat: Challenges and opportunities in conversational hardware design. In 2023 ACM/IEEE 5th Workshop on Machine Learning for CAD (MLCAD). IEEE, 1–6.
- LegUp: high-level synthesis for FPGA-based processor/accelerator systems. In Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays. 33–36.
- Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework. arXiv preprint arXiv:2403.11202 (2024).
- ChipGPT: How far are we from natural language hardware design. arXiv preprint arXiv:2305.14019 (2023).
- An introduction to high-level synthesis. IEEE Design & Test of Computers 26, 4 (2009), 8–17.
- Steve Dai and Zhiru Zhang. 2019. Improving scalability of exact modulo scheduling with specialized conflict-driven learning. In Proceedings of the 56th Annual Design Automation Conference 2019. 1–6.
- A deep learning framework for Verilog autocompletion towards design and verification automation. arXiv preprint arXiv:2304.13840 (2023).
- GPT4AIGChip: Towards next-generation AI accelerator design automation via large language models. In 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). IEEE, 1–9.
- DeepSeek-Coder: When the Large Language Model Meets Programming–The Rise of Code Intelligence. arXiv preprint arXiv:2401.14196 (2024).
- Hsuan Hsiao and Jason Anderson. 2019. Thread weaving: Static resource scheduling for multithreaded high-level synthesis. In Proceedings of the 56th Annual Design Automation Conference 2019. 1–6.
- LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685 (2021).
- TensorLib: A spatial accelerator generation framework for tensor algebra. In 2021 58th ACM/IEEE Design Automation Conference (DAC). IEEE, 865–870.
- EMS: efficient memory subsystem synthesis for spatial accelerators. In Proceedings of the 59th ACM/IEEE Design Automation Conference. 67–72.
- Dynamically scheduled high-level synthesis. In Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 127–136.
- Ammus: A survey of transformer-based pretrained models in natural language processing. arXiv preprint arXiv:2108.05542 (2021).
- ChipNeMo: Domain-adapted LLMs for chip design. arXiv preprint arXiv:2311.00176 (2023).
- VerilogEval: Evaluating large language models for Verilog code generation. In 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). IEEE, 1–8.
- RTLCoder: Outperforming GPT-3.5 in design RTL generation with our open-source dataset and lightweight solution. arXiv preprint arXiv:2312.08617 (2023).
- StarCoder 2 and The Stack v2: The Next Generation. arXiv preprint arXiv:2402.19173 (2024).
- RTLLM: An open-source benchmark for design RTL generation with large language model. In 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 722–727.
- Bardia Nadimi and Hao Zheng. 2024. A Multi-Expert Large Language Model Architecture for Verilog Code Generation. arXiv preprint arXiv:2404.08029 (2024).
- DAVE: Deriving automatically Verilog from English. In Proceedings of the 2020 ACM/IEEE Workshop on Machine Learning for CAD. 27–32.
- BetterV: Controlled Verilog Generation with Discriminative Guidance. arXiv preprint arXiv:2402.03375 (2024).
- Code llama: Open foundation models for code. arXiv preprint arXiv:2308.12950 (2023).
- A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond. arXiv preprint arXiv:2403.14734 (2024).
- Benchmarking large language models for automated Verilog RTL code generation. In 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1–6.
- VeriGen: A large language model for Verilog code generation. ACM Transactions on Design Automation of Electronic Systems (2023).
- AutoChip: Automating HDL generation using LLM feedback. arXiv preprint arXiv:2311.04887 (2023).
- RTLFixer: Automatically fixing RTL syntax errors with large language models. arXiv preprint arXiv:2311.16543 (2023).
- Stephen Williams and Michael Baxter. 2002. Icarus Verilog: open-source Verilog more than a year later. Linux Journal 2002, 99 (2002), 3.
- ChatEDA: A large language model powered autonomous agent for EDA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2024).
- HECTOR: A multi-level intermediate representation for hardware synthesis methodologies. In Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design. 1–9.
- SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models.
- Fan Cui
- Chenyang Yin
- Kexing Zhou
- Youwei Xiao
- Guangyu Sun
- Qiang Xu
- Qipeng Guo
- Demin Song
- Dahua Lin
- Xingcheng Zhang
- Yun Liang