HumanEval on Latest GPT Models -- 2024 (2402.14852v1)

Published 20 Feb 2024 in cs.CL, cs.AI, and cs.LG

Abstract: In 2023, we are using the latest GPT-4 models to advance program synthesis. These LLMs have significantly improved the state of the art for this purpose. To make these advancements more accessible, we have created a repository that connects these models to HumanEval. This dataset was initially developed to be used with an LLM called CODEGEN on natural and programming language data. The utility of these trained models is showcased by demonstrating their competitive performance in zero-shot Python code generation on HumanEval tasks compared to previous state-of-the-art solutions. Additionally, this paves the way for developing multi-step synthesis paradigms. This benchmark features 160 diverse problem sets factorized into multistep prompts, which our analysis shows significantly improve program synthesis over single-turn inputs. All code is open source at https://github.com/daniel442li/gpt-human-eval .
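
For readers unfamiliar with how a HumanEval-style task is scored, the following is a minimal sketch of a functional-correctness check. It is not the code from the authors' repository. The `task` dictionary is a made-up problem whose field names (`prompt`, `test`, `entry_point`) mirror the public HumanEval dataset, and the helper `passes` is a hypothetical name introduced only for illustration. In practice the completion would come from a model call, which is omitted here.

```python
# Hypothetical HumanEval-style task. The field names follow the public dataset,
# but this particular problem is invented for illustration.
task = {
    "prompt": 'def add(a, b):\n    """Return the sum of a and b."""\n',
    "test": (
        "def check(candidate):\n"
        "    assert candidate(1, 2) == 3\n"
        "    assert candidate(-1, 1) == 0\n"
    ),
    "entry_point": "add",
}


def passes(task, completion):
    """Return True if prompt + completion passes the task's check() function."""
    program = task["prompt"] + completion + "\n" + task["test"]
    namespace = {}
    try:
        # exec is unsafe for untrusted model output; a real harness sandboxes this.
        exec(program, namespace)
        namespace["check"](namespace[task["entry_point"]])
        return True
    except Exception:
        # Syntax errors, runtime errors, and failed assertions all count as failures.
        return False


# Stand-ins for model completions (function bodies only, as in zero-shot use).
good = "    return a + b\n"
bad = "    return a - b\n"
print(passes(task, good), passes(task, bad))
```

A zero-shot evaluation would apply a check like this to one or more sampled completions per problem and report the fraction of problems solved.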


Authors: Daniel Li, Lincoln Murr
