- The paper introduces a neural classifier using GraphCodeBERT embeddings to differentiate AI-generated code from human-written code, achieving a ROC AUC of 0.964.
- It reveals significant geographic disparities in AI adoption, with the US, Germany, and France leading while countries such as China and Russia lag.
- Empirical analysis shows a 2.4% increase in commit activity and projects a $9.6-$14.4 billion annual economic boost from AI integration in coding.
Global Diffusion and Impact of AI in Coding
Introduction
The paper investigates how widely generative AI tools are used in coding and what effect they have. It highlights the uneven diffusion of these tools and quantifies their impact on productivity and innovation in software development worldwide. Through an empirical analysis, it shows how AI has reshaped programming across countries and demographic groups, addresses adoption barriers, and estimates the economic implications.
Generative AI Classifier Development
The core of the paper is a neural classifier that distinguishes AI-generated code from human-written code, using GitHub data covering 80 million commits from 200,000 developers. The method combines GraphCodeBERT embeddings with a classifier head to identify AI-generated content. The model is trained on Python functions from both human and AI-generated sources and reaches an out-of-sample ROC AUC of 0.964 and an average precision of 0.969.
Figure 1: Classifying code from functions written in the Python programming language as human or AI generated.
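The two reported metrics can be illustrated with a minimal, dependency-free sketch. The function names, labels, and scores below are hypothetical toy values chosen for illustration; they are not the paper's data or its evaluation code. ROC AUC is computed as the probability that a randomly chosen AI-generated example receives a higher score than a randomly chosen human-written one, and average precision as the mean precision at each rank where an AI-generated example appears.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of precision@k over the ranks k that hold a positive.

    This simple version assumes there are no tied scores.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / k)
    if not precisions:
        raise ValueError("need at least one positive example")
    return sum(precisions) / len(precisions)


# Toy example: 1 = AI-generated, 0 = human-written (illustrative values only).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.95, 0.80, 0.70, 0.60, 0.30, 0.10]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
print(f"ROC AUC: {toy_auc:.3f}, average precision: {toy_ap:.3f}")
```

In practice these metrics would normally come from an established library; the pairwise loop here is quadratic and only meant to make the definitions concrete.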
Adoption Trends Across Regions
The paper provides a detailed analysis of AI adoption rates across the globe using the developed classifier. By tracking Python functions over time, it highlights significant disparities between countries, with the US showing the highest adoption rates, closely followed by Germany and France, while countries like China and Russia lag in adoption.

Figure 2: Share of AI-generated Python functions over time, illustrating geographic differences in adoption.
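The adoption measure described above, the share of functions flagged as AI-generated per country and period, can be sketched as a simple aggregation. The record layout, the country labels `country_a` and `country_b`, and all values are placeholders assumed for illustration; the summary does not specify how the paper stores or thresholds classifier output.

```python
from collections import defaultdict


def ai_share_by_group(records):
    """Return {(country, period): fraction of functions flagged as AI-generated}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for country, period, is_ai in records:
        key = (country, period)
        totals[key] += 1
        if is_ai:
            flagged[key] += 1
    return {key: flagged[key] / totals[key] for key in totals}


# Toy records: (country, period, classifier_flag). Placeholder values only.
records = [
    ("country_a", "2023Q1", True),
    ("country_a", "2023Q1", False),
    ("country_a", "2023Q1", False),
    ("country_a", "2023Q1", False),
    ("country_a", "2024Q1", True),
    ("country_a", "2024Q1", True),
    ("country_a", "2024Q1", False),
    ("country_a", "2024Q1", False),
    ("country_b", "2024Q1", True),
    ("country_b", "2024Q1", False),
    ("country_b", "2024Q1", False),
    ("country_b", "2024Q1", False),
]

shares = ai_share_by_group(records)
for (country, period), share in sorted(shares.items()):
    print(f"{country} {period}: {share:.0%}")
```

Grouping by a (country, period) key keeps the sketch independent of any particular time granularity; quarters are used here only as an example.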
Demographics and AI Use
Among US developers, AI tool adoption is notably higher among newer GitHub users than among veterans, yet shows no significant gender differences. This finding runs counter to earlier studies that identified gender disparities in technology adoption. It underscores substantial differences in adoption rates based on user tenure.
Figure 3: Intensity of AI use by gender and tenure among GitHub users in the US.
Impact on Productivity and Innovation
Generative AI use correlates with increased activity and a higher propensity for library exploration. Regression models indicate a 2.4% increase in commit activity and an associated rise in experimentation with new libraries among developers using AI. Additionally, conservative economic estimates place the annual value of generative AI at $9.6-$14.4 billion, with potentially higher figures under the broader productivity gains validated by RCTs and natural experiments.
Figure 4: Effect of AI Share on library imports, commit activity, and AI adoption likelihood.
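The summary does not spell out how the dollar range is derived from the 2.4% effect. As a back-of-envelope check, and purely as an assumption, suppose the annual value is the productivity gain multiplied by some annual base, such as spending on coding work. The sketch below computes the base that such a reading would imply. The functions `annual_value` and `implied_base` are hypothetical helpers, and the resulting base figures are back-derived from the numbers above, not stated in the source.

```python
def annual_value(base_billion, gain):
    """Annual value in billions, assuming value = base * gain (an assumption)."""
    return base_billion * gain


def implied_base(value_billion, gain):
    """Base in billions that would yield the given value under the same assumption."""
    return value_billion / gain


gain = 0.024                       # 2.4% increase in commit activity (from the text)
value_low, value_high = 9.6, 14.4  # reported range, in billions of dollars

base_low = implied_base(value_low, gain)
base_high = implied_base(value_high, gain)
print(f"Implied base: ${base_low:.0f}-${base_high:.0f} billion per year")
```

Under this reading, the reported range corresponds to a base of roughly $400-$600 billion per year. Whether the paper uses this exact construction is not confirmed by the summary.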
Discussion
The paper concludes by discussing the broader implications of generative AI adoption in coding, suggesting potential barriers and avenues for further research. The findings provide pivotal insights that inform policymakers about the heterogeneous adoption of AI technologies and their tangible impact on economic productivity and innovation.
Conclusion
This extensive analysis of AI usage in coding unveils significant global disparities in adoption rates, driven by both regulatory and cultural influences. The study establishes the substantial productivity gains possible through AI integration while raising important considerations regarding equitable access and future technological adoption pathways. The insights provided form a foundational step toward understanding the nuanced implications of AI in coding, laying the groundwork for ongoing observation and policy adaptation.