OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection (2407.16237v2)

Published 23 Jul 2024 in cs.AR, cs.AI, and cs.LG

Abstract: Recent studies have demonstrated the significant potential of LLMs in generating Register Transfer Level (RTL) code, with notable advancements showcased by commercial models such as GPT-4 and Claude3-Opus. However, these proprietary LLMs often raise concerns regarding privacy and security. While open-source LLMs offer solutions to these concerns, they typically underperform commercial models in RTL code generation tasks, primarily due to the scarcity of high-quality open-source RTL datasets. To address this challenge, we introduce OriGen, a fully open-source framework that incorporates self-reflection capabilities and a novel dataset augmentation methodology for generating high-quality, large-scale RTL code. Our approach employs a code-to-code augmentation technique to enhance the quality of open-source RTL code datasets. Furthermore, OriGen can rectify syntactic errors through a self-reflection process that leverages compiler feedback. Experimental results demonstrate that OriGen significantly outperforms other open-source alternatives in RTL code generation. It surpasses the previous best-performing open-source LLM by 12.8% and even exceeds GPT-4 Turbo in the pass@1 metric on the VerilogEval-Human benchmark. Moreover, OriGen exhibits superior capabilities in self-reflection and error correction, outperforming GPT-4 by 19.9% on a benchmark designed to evaluate self-reflection capabilities.
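The abstract describes a self-reflection process in which compiler feedback is fed back to the model to repair syntactic errors. Below is a minimal sketch of that general feedback pattern, assuming Icarus Verilog (`iverilog`) as the compiler and a placeholder `generate` callable standing in for the underlying code LLM; it illustrates the loop structure only and is not OriGen's actual implementation.

```python
import os
import subprocess
import tempfile

def compile_verilog(code: str) -> str:
    """Compile Verilog with Icarus Verilog; return compiler diagnostics ('' on success)."""
    with tempfile.NamedTemporaryFile("w", suffix=".v", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        result = subprocess.run(
            ["iverilog", "-o", os.devnull, path],
            capture_output=True, text=True,
        )
        return result.stderr.strip()
    finally:
        os.remove(path)

def self_reflect(generate, spec: str, max_rounds: int = 3) -> str:
    """Iteratively regenerate RTL until it compiles or the round budget is exhausted.

    `generate(prompt)` is a hypothetical stand-in for the code-generation LLM.
    """
    code = generate(spec)
    for _ in range(max_rounds):
        errors = compile_verilog(code)
        if not errors:  # clean compile: stop reflecting
            break
        # Feed the compiler diagnostics back so the model can repair its own output.
        prompt = (
            f"{spec}\n\nYour previous Verilog:\n{code}\n\n"
            f"The compiler reported:\n{errors}\n\nReturn a corrected module."
        )
        code = generate(prompt)
    return code
```

In practice the loop would also run functional tests, not just a syntax check, but the compile-and-feed-back step shown here is the core of compiler-guided self-reflection.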

Authors (11)
  1. Fan Cui (14 papers)
  2. Chenyang Yin (3 papers)
  3. Kexing Zhou (1 paper)
  4. Youwei Xiao (1 paper)
  5. Guangyu Sun (47 papers)
  6. Qiang Xu (129 papers)
  7. Qipeng Guo (72 papers)
  8. Demin Song (11 papers)
  9. Dahua Lin (336 papers)
  10. Xingcheng Zhang (29 papers)
  11. Yun Liang
Citations (1)