
VeriGen: A Large Language Model for Verilog Code Generation (2308.00708v1)

Published 28 Jul 2023 in cs.PL, cs.LG, and cs.SE

Abstract: In this study, we explore the capability of LLMs to automate hardware design by generating high-quality Verilog code, a common language for designing and modeling digital systems. We fine-tune pre-existing LLMs on Verilog datasets compiled from GitHub and Verilog textbooks. We evaluate the functional correctness of the generated Verilog code using a specially designed test suite, featuring a custom problem set and test benches. Here, our fine-tuned open-source CodeGen-16B model outperforms the commercial state-of-the-art GPT-3.5-turbo model by 1.1% overall. Upon testing with a more diverse and complex problem set, we find that the fine-tuned model shows competitive performance against state-of-the-art GPT-3.5-turbo, excelling in certain scenarios. Notably, it demonstrates a 41% improvement in generating syntactically correct Verilog code across various problem categories compared to its pre-trained counterpart, highlighting the potential of smaller, in-house LLMs in hardware design automation.
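To make the workflow concrete, below is a minimal sketch of prompting a fine-tuned CodeGen checkpoint to complete a Verilog module and then running a compile-based syntax check, in the spirit of the paper's evaluation pipeline. The checkpoint name `shailja/fine-tuned-codegen-2B-Verilog`, the example prompt, and the use of Icarus Verilog (`iverilog`) for screening are assumptions for illustration, not the authors' exact setup.

```python
# Sketch: Verilog completion with a fine-tuned CodeGen checkpoint, then a
# rough syntax check with Icarus Verilog. Checkpoint name is an assumption.
import pathlib
import subprocess
import tempfile

from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "shailja/fine-tuned-codegen-2B-Verilog"  # assumed released checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Partial module header as the prompt; the model completes the body.
prompt = (
    "// 2-to-1 multiplexer\n"
    "module mux2to1(input a, input b, input sel, output out);\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.5,  # sampling settings here are illustrative
    top_p=0.95,
)
verilog = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(verilog)

# Compile the completion with iverilog as a crude syntactic-correctness
# filter; functional correctness would additionally require a test bench.
with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / "mux2to1.v"
    src.write_text(verilog)
    result = subprocess.run(
        ["iverilog", "-o", str(pathlib.Path(tmp) / "a.out"), str(src)],
        capture_output=True,
        text=True,
    )
    print("syntax OK" if result.returncode == 0 else result.stderr)
```

In practice the raw completion may include stray text after `endmodule`, so a real harness would trim the output to the first complete module before compiling.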

Authors (7)
  1. Shailja Thakur (12 papers)
  2. Baleegh Ahmad (9 papers)
  3. Hammond Pearce (35 papers)
  4. Benjamin Tan (42 papers)
  5. Brendan Dolan-Gavitt (24 papers)
  6. Ramesh Karri (92 papers)
  7. Siddharth Garg (99 papers)
Citations (84)