Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 71 tok/s
Gemini 2.5 Pro 54 tok/s Pro
GPT-5 Medium 22 tok/s Pro
GPT-5 High 29 tok/s Pro
GPT-4o 88 tok/s Pro
Kimi K2 138 tok/s Pro
GPT OSS 120B 446 tok/s Pro
Claude Sonnet 4.5 35 tok/s Pro
2000 character limit reached

Unveiling the Role of ChatGPT in Software Development: Insights from Developer-ChatGPT Interactions on GitHub (2505.03901v1)

Published 6 May 2025 in cs.SE

Abstract: The advent of LLMs has introduced a new paradigm in software engineering, with generative AI tools like ChatGPT gaining widespread adoption among developers. While ChatGPT's potential has been extensively discussed, there is limited empirical evidence exploring its real-world usage by developers. This study bridges this gap by conducting a large-scale empirical analysis of ChatGPT-assisted development activities, leveraging a curated dataset, DevChat, comprising 2,547 unique shared ChatGPT links collected from GitHub between May 2023 and June 2024. Our study examines the characteristics of ChatGPT's usage on GitHub (including the tendency, prompt turns distribution, and link descriptions) and identifies five categories of developers' purposes for sharing developer-ChatGPT conversations during software development. Additionally, we analyzed the development-related activities where developers shared ChatGPT links to facilitate their workflows. We then established a mapping framework among data sources, activities, and SE tasks associated with these shared ChatGPT links. Our study offers a comprehensive view of ChatGPT's application in real-world software development scenarios and provides a foundation for its future integration into software development workflows.

Summary

  • The paper demonstrates ChatGPT's integration into software development using the DevChat dataset with 2,547 unique GitHub links.
  • It shows that ChatGPT is predominantly used for code generation (43.4%) and commits (32.3%), highlighting its role in task delegation.
  • Findings emphasize the tool's impact on collaboration by providing contextual descriptions in commits and pull requests.

Unveiling the Role of ChatGPT in Software Development

This essay explores the findings and implications of the paper "Unveiling the Role of ChatGPT in Software Development: Insights from Developer-ChatGPT Interactions on GitHub" (2505.03901), which provides a comprehensive account of how developers engage with ChatGPT in software development, focusing heavily on characteristics and patterns of usage on GitHub. The authors introduce the DevChat dataset, analyze shared ChatGPT links, and elucidate developers' purposes and workflow integration using ChatGPT.

Characteristics of ChatGPT Usage on GitHub

The paper leverages the DevChat dataset, which contains 2,547 unique shared ChatGPT links from GitHub, providing insights into developers' interactions with ChatGPT. Notably, these interactions predominantly occur in Code (43.4%) and Commits (32.3%), with a peak in usage following the introduction of ChatGPT's sharing feature in May 2023. Figure 1

Figure 1: Data distribution of shared ChatGPT links on GitHub.

The sharp increase in adoption, peaking in August 2023, underscores the importance and appeal of ChatGPT in real-world scenarios. The prompt turns distribution highlights that most interactions involve short, task-focused prompts, though some engage in multi-turn dialogues for more complex problems. Figure 2

Figure 2: Prompt turns distribution of the developer-ChatGPT interactions during software development on GitHub.

A significant observation is that the majority of shared ChatGPT links include contextual descriptions, improving collaboration via clarity and transparency in discussions, especially potent in Commits and Pull Requests. This reflects a cultural shift wherein AI-assisted insights significantly supplement human expertise within GitHub workflows.

Developers' Purposes for Sharing ChatGPT Conversations

The paper identifies five primary categories of purposes for sharing ChatGPT interactions during software development: Task Delegation, Problem Resolution, Knowledge Acquisition, Solution Recommendation, and Concept Interpretation. Task Delegation emerged as the dominant purpose, particularly in scenarios requiring repetitive task automation like boilerplate code generation. Figure 3

Figure 3: Developers' purposes for sharing ChatGPT links from the five sources.

Conversely, Problem Resolution and Concept Interpretation received less focus, reflecting the current limitations of ChatGPT in dealing with context-dependent challenges that still require substantial human expert analysis. In Discussion and Issues, significant engagement was observed in Knowledge Acquisition and Solution Recommendation, illustrating ChatGPT's value in collaborative ideation.

The paper provides an in-depth mapping of ChatGPT's role across development activities, revealing that code development and maintenance are the primary activities where ChatGPT plays a significant role. Figure 4

Figure 4: Distribution of development-related activities involving shared ChatGPT links from five data sources.

Software Development emerged as the most extensive engagement activity, affirming ChatGPT's utility in generating and modifying code. Additionally, Software Maintenance and Evolution benefitted from ChatGPT's capabilities to refine and optimize codebases iteratively. Figure 5

Figure 5: Heat map of the development-related activities involving shared ChatGPT links from five data sources.

SE Tasks of ChatGPT Conversations in Software Development

The mapping of ChatGPT interactions underscores its strong presence in automating coding-related tasks, notably Code Generation and Completion, and Code Modification and Optimization. However, early-stage requirement tasks like Requirements Elicitation are underrepresented, indicating potential gaps in ChatGPT's integration into comprehensive system design or requirements analysis processes. Figure 6

Figure 6: Mapping relationships among data sources, development-related activities, and SE tasks.

The allocation of specific tasks within each activity not only provides insight into current ChatGPT usage trends but also helps identify where future tool development might focus to broaden ChatGPT's applicability across various phases of the software development life cycle.

Conclusion

This paper offers a quantitative and qualitative understanding of ChatGPT's integration into software development through extensive empirical analysis of GitHub interactions. While ChatGPT is prevalently used for code-centric tasks, the findings prompt a reevaluation of how AI can augment activities traditionally dependent on human judgment. Future research should focus on enhancing ChatGPT's capabilities in broader domains like architectural design and testing while maintaining robust measures for accuracy and efficiency in automated tasks. As the integration of GenAI tools into software development deepens, adapting development processes and standards to encompass AI-driven collaboration will be crucial for maximizing the potential of such technologies.

Ai Generate Text Spark Streamline Icon: https://streamlinehq.com

Explain it Like I'm 14

Overview

This paper looks at how real software developers use ChatGPT while working on code. The authors collected and studied thousands of public links to ChatGPT conversations that developers shared on GitHub. Their goal was to understand when, why, and how ChatGPT is used in everyday programming tasks.

Key Objectives

To make things simple, the paper asked four main questions:

  • How are developers using ChatGPT during software development?
  • Why do developers share their ChatGPT conversations on GitHub?
  • In what kinds of development activities do these shared conversations show up?
  • What specific software tasks are ChatGPT helping with?

Methods and Approach

Here’s how the researchers did it, explained with everyday ideas:

  • Finding the conversations: Since May 2023, ChatGPT has let users share a link to their chat. The team searched GitHub (a site where developers store and discuss code) for these shared links in five places:
    • Code (files and comments inside repositories)
    • Commits (the “save points” of code changes)
    • Issues (bug reports or task tickets)
    • Pull Requests (code change reviews before merging)
    • Discussions (forum-like posts)
  • Getting past limits: GitHub searches can only show so many results at once. To work around this, the team split searches by programming language and time (month-by-month) and used both the official GitHub API and a web crawler when needed.
  • Cleaning the data: They removed broken links, non-English content, and duplicates. After this careful filtering, they ended up with a clean dataset called DevChat containing 2,547 unique shared ChatGPT links.
  • Understanding the chats: They looked at things like:
    • “Prompt turns” (how many back-and-forth messages between developer and ChatGPT—like a text conversation thread)
    • Whether people included a description explaining the shared link
    • What the conversation was used for (purpose), what activity it related to, and what specific software task it helped with
  • Sorting and grouping (the “Constant Comparison” method): Think of this like sorting lots of sticky notes into labeled piles. The authors read many conversations, created simple labels (like “generate code” or “fix a bug”), grouped similar ones together, and then connected how these groups relate. They also checked each other’s labeling to keep it reliable.

Main Findings and Why They Matter

  • Most links came from Code (43.4%) and Commits (32.3%). That means people mainly used and shared ChatGPT while writing or changing code.
  • The number of shared links peaked around August 2023 and kept growing over time—showing strong interest and adoption.

Why it matters: This suggests ChatGPT is becoming a regular tool in coding and version control, not just something used in discussions or planning.

How they talk to ChatGPT

  • Many conversations were short and focused: most common were two or three messages back-and-forth.
  • Longer chats happened more in Code, which makes sense for tricky tasks like debugging or step-by-step code generation.

Why it matters: ChatGPT is often used like a quick helper for specific tasks, and sometimes as a deeper partner for complex coding.

  • Most shared links had a description or context (about 83%), especially in Commits (almost all had descriptions), Pull Requests, and Discussions.
  • In Code, descriptions were more mixed—some had helpful context, some didn’t.

Why it matters: Good descriptions help teammates understand why the chat is useful, especially in collaborative workflows.

Why developers share these chats (five main purposes)

  • Task Delegation: The most common purpose. Developers ask ChatGPT to do repetitive tasks like generating boilerplate code or writing a README.
  • Problem Resolution: Using ChatGPT to debug or fix errors—more common in Issues and Commits.
  • Knowledge Acquisition: Learning or clarifying concepts—more common in Discussions and Issues.
  • Solution Recommendation: Getting suggestions, best practices, or approaches—also more common in Discussions and Issues.
  • Concept Interpretation: Asking ChatGPT to explain ideas or structures—used less overall, but helpful for quick understanding.

Why it matters: ChatGPT isn’t just a code generator—it also teaches, recommends, and troubleshoots. But most shared uses focus on “getting tasks done.”

What development activities and tasks are involved

  • Activities covered both standard software stages (like requirements, design, development, testing, deployment, maintenance) and supporting activities (like handling data and searching for info).
  • The most common activities were Software Development and Software Maintenance & Evolution.
  • They mapped 39 specific software engineering tasks. The top ones were:
    • Code Generation and Completion (making and finishing code)
    • Code Modification and Optimization (improving or changing existing code)

Why it matters: This gives a clear picture of where ChatGPT fits best today—hands-on programming and improving existing code.

Implications and Potential Impact

  • Better tools and workflows: Knowing how developers use ChatGPT can help teams design smarter tools and processes (for example, easier ways to attach chat links to commits with clear descriptions).
  • Productivity boost: Since many uses are short, specific tasks, integrating ChatGPT into editors and code review tools can save time without much overhead.
  • Human review still needed: Because ChatGPT can be wrong or miss context, developers should keep checking results—especially for tricky bugs or important changes.
  • Training and onboarding: ChatGPT’s role in knowledge and solution recommendations suggests it can help new team members learn faster.
  • Future research: The DevChat dataset offers a foundation for deeper studies, like how prompts affect code quality, or which activities benefit most from AI assistance.

In short, ChatGPT is becoming a practical, everyday assistant in software development, especially for writing, editing, and fixing code. With good practices and human oversight, it can make teams faster and more effective.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube