Unmasking the giant: A comprehensive evaluation of ChatGPT's proficiency in coding algorithms and data structures (2307.05360v3)

Published 10 Jul 2023 in cs.SE, cs.AI, and cs.CL

Abstract: The transformative influence of LLMs is profoundly reshaping the AI technology domain. Notably, ChatGPT distinguishes itself within these models, demonstrating remarkable performance in multi-turn conversations and exhibiting code proficiency across an array of languages. In this paper, we carry out a comprehensive evaluation of ChatGPT's coding capabilities based on what is to date the largest catalog of coding challenges. Our focus is on the Python programming language and problems centered on data structures and algorithms, two topics at the very foundations of Computer Science. We evaluate ChatGPT for its ability to generate correct solutions to the problems fed to it, its code quality, and the nature of the run-time errors thrown by its code. Where ChatGPT code successfully executes but fails to solve the problem at hand, we look into patterns in the test cases passed in order to gain some insight into how wrong ChatGPT code is in these kinds of situations. To infer whether ChatGPT might have directly memorized some of the data that was used to train it, we methodically design an experiment to investigate this phenomenon. Making comparisons with human performance whenever feasible, we investigate all the above questions in the context of both its underlying learning models (GPT-3.5 and GPT-4), on a vast array of sub-topics within the main topics, and on problems having varying degrees of difficulty.

Authors (5)
  1. Sayed Erfan Arefin (8 papers)
  2. Tasnia Ashrafi Heya (4 papers)
  3. Hasan Al-Qudah (1 paper)
  4. Ynes Ineza (1 paper)
  5. Abdul Serwadda (4 papers)
Citations (5)