Working with AI: Measuring the Occupational Implications of Generative AI (2507.07935v1)

Published 10 Jul 2025 in cs.AI, cs.CY, econ.GN, and q-fin.EC

Abstract: Given the rapid adoption of generative AI and its potential to impact a wide range of tasks, understanding the effects of AI on the economy is one of society's most important questions. In this work, we take a step toward that goal by analyzing the work activities people do with AI, how successfully and broadly those activities are done, and combine that with data on what occupations do those activities. We analyze a dataset of 200k anonymized and privacy-scrubbed conversations between users and Microsoft Bing Copilot, a publicly available generative AI system. We find the most common work activities people seek AI assistance for involve gathering information and writing, while the most common activities that AI itself is performing are providing information and assistance, writing, teaching, and advising. Combining these activity classifications with measurements of task success and scope of impact, we compute an AI applicability score for each occupation. We find the highest AI applicability scores for knowledge work occupation groups such as computer and mathematical, and office and administrative support, as well as occupations such as sales whose work activities involve providing and communicating information. Additionally, we characterize the types of work activities performed most successfully, how wage and education correlate with AI applicability, and how real-world usage compares to predictions of occupational AI impact.

Summary

The paper introduces a novel empirical framework to map AI usage to occupational tasks using a large-scale dataset from Bing Copilot.
The study employs a multi-stage LLM-based classification pipeline to quantify AI's augmentation versus automation across work activities.
Findings reveal significant impact in knowledge and communication roles while showing limited effects on manual and physical occupations.

Occupational Implications of Generative AI: An Empirical Analysis of Real-World Usage

This paper presents a comprehensive empirical study of the occupational impact of generative AI, leveraging a large-scale dataset of 200,000 anonymized conversations between users and Microsoft Bing Copilot. The authors introduce a novel framework for mapping real-world AI usage to occupational work activities, providing a data-driven assessment of which tasks and occupations are most affected by generative AI systems. The analysis is grounded in the O*NET taxonomy, enabling systematic aggregation from granular work activities to occupational categories.

Methodological Contributions

The paper distinguishes between two axes of AI impact in human-AI interactions:

User Goals: The work activities users seek assistance with.
AI Actions: The work activities the AI system actually performs.

This distinction enables a nuanced analysis of augmentation (AI assisting users) versus automation (AI performing tasks directly). The authors employ a multi-stage LLM-based classification pipeline to map each conversation to relevant O*NET Intermediate Work Activities (IWAs), separately for user goals and AI actions. Task success is measured via explicit user feedback (thumbs up/down) and an LLM-based task completion classifier. The scope of AI impact is further quantified using a Likert-scale assessment of how much of an IWA is addressed in each interaction.

An AI applicability score is computed for each occupation, integrating:

The frequency with which its associated IWAs are observed in Copilot usage (activity share threshold: 0.05%).
Task completion rates.
Scope of impact (fraction of IWA covered at moderate or higher level).
Importance and relevance weights from O*NET for each occupation-IWA pair.

This score is designed for relative comparison across occupations, avoiding the pitfalls of threshold sensitivity that affect absolute coverage metrics.
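As a rough illustration of how these four ingredients could be combined, the sketch below computes a weighted average of completion rate times scope over the IWAs that clear the activity share threshold. The exact formula used in the paper is not given in this summary, so the multiplicative combination, the normalization by total weight, and every name here (`IwaStats`, `applicability`, `SHARE_THRESHOLD`) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Mapping

# 0.05% activity share threshold, as stated in the summary above.
SHARE_THRESHOLD = 0.0005


@dataclass
class IwaStats:
    """Hypothetical per-IWA measurements derived from the conversation data."""
    share: float       # fraction of observed activity mapped to this IWA
    completion: float  # task completion rate, in [0, 1]
    scope: float       # fraction covered at moderate or higher scope, in [0, 1]


def applicability(weights: Mapping[str, float],
                  stats: Mapping[str, IwaStats]) -> float:
    """Weighted average of completion * scope over an occupation's IWAs.

    `weights` maps each IWA of the occupation to an importance/relevance
    weight (assumed to come from O*NET). IWAs that are unobserved or fall
    below the share threshold contribute zero.
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    score = 0.0
    for iwa, weight in weights.items():
        s = stats.get(iwa)
        if s is not None and s.share >= SHARE_THRESHOLD:
            score += weight * s.completion * s.scope
    return score / total_weight


# Toy occupation with three IWAs; only "gather_info" clears the threshold.
toy_weights = {"gather_info": 2.0, "operate_machinery": 1.0, "lift_objects": 1.0}
toy_stats = {
    "gather_info": IwaStats(share=0.01, completion=0.8, scope=0.5),
    "operate_machinery": IwaStats(share=0.0001, completion=0.9, scope=0.9),
}
toy_score = applicability(toy_weights, toy_stats)
```

Because the result is normalized by the occupation's total weight, scores from this sketch are comparable across occupations with different numbers of IWAs, which matches the relative-comparison purpose described above.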

Key Empirical Findings

Distribution of AI Usage Across Work Activities

Analysis of Copilot conversations reveals that the most common user goals are:

Gathering information.
Writing and editing content.
Communicating with others.

The most frequent AI actions are:

Providing information and assistance.
Teaching, coaching, and advising.

A notable finding is that in 40% of conversations, the sets of user goal IWAs and AI action IWAs are disjoint, highlighting the asymmetry between what users seek and what AI systems deliver.
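The disjointness statistic can be made concrete with a small sketch. It assumes each conversation has already been classified into a set of user goal IWAs and a set of AI action IWAs; the function name `disjoint_fraction` and the toy labels are hypothetical and not taken from the paper.

```python
from typing import Iterable, Set, Tuple


def disjoint_fraction(conversations: Iterable[Tuple[Set[str], Set[str]]]) -> float:
    """Fraction of conversations whose goal and action IWA sets share no element."""
    total = 0
    disjoint = 0
    for goal_iwas, action_iwas in conversations:
        total += 1
        if goal_iwas.isdisjoint(action_iwas):
            disjoint += 1
    return disjoint / total if total else 0.0


# Toy data: (user goal IWAs, AI action IWAs) per conversation.
toy_conversations = [
    ({"gather_info"}, {"provide_info"}),            # disjoint
    ({"write_content"}, {"write_content"}),         # overlapping
    ({"gather_info"}, {"teach", "advise"}),         # disjoint
    ({"edit_text", "write_content"}, {"edit_text"}),  # overlapping
    ({"analyze_data"}, {"analyze_data", "advise"}),   # overlapping
]
toy_fraction = disjoint_fraction(toy_conversations)
```

In this toy data two of five conversations are disjoint. A conversation where a user gathers information while the AI provides it is the typical disjoint case: the two sides perform complementary rather than identical activities.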

Occupational Impact

Occupations with the highest AI applicability scores are concentrated in knowledge work and communication-intensive roles, including:

Interpreters and Translators.
Historians.
Sales Representatives.
Writers and Authors.
Customer Service Representatives.
Technical Writers, Editors, and Data Scientists.

Conversely, occupations with the lowest scores are those with substantial physical, manual, or machinery-related components, such as:

Nursing Assistants.
Plant and System Operators.
Automotive Technicians.
Construction and Extraction Workers.

All major occupational groups exhibit some degree of AI applicability, but the breadth and depth of impact are highly uneven.

Task Success and Scope

Work activities involving writing, editing, and information gathering receive the highest rates of positive user feedback and task completion. Activities related to data analysis and visual design are less successfully assisted or performed by Copilot. The scope of AI impact is broader for user assistance than for direct AI performance, indicating that current generative AI systems are more effective as augmentative tools than as full task automators.

Socioeconomic Correlates

The correlation between AI applicability and occupational wage is weak (employment-weighted $r = 0.07$), with only a slightly higher average applicability for occupations requiring a Bachelor's degree. High-employment, lower-wage occupations in sales and office support exhibit substantial AI applicability, challenging the narrative that generative AI primarily affects high-wage, high-education roles.
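For readers unfamiliar with employment weighting, the sketch below shows one standard way to compute a weighted Pearson correlation, where each occupation counts in proportion to its employment. This summary does not specify the estimator the authors used, so treating it as a plain weighted Pearson coefficient is an assumption, and the function name `weighted_pearson` and the toy numbers are illustrative only.

```python
import math
from typing import Sequence


def weighted_pearson(x: Sequence[float],
                     y: Sequence[float],
                     w: Sequence[float]) -> float:
    """Pearson correlation of x and y with observation weights w."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y, and w must have the same length")
    total = sum(w)
    # Weighted means.
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    # Weighted covariance and variances.
    cov = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)


# Toy occupations: applicability score, wage, and employment (all made up).
toy_scores = [0.10, 0.20, 0.30]
toy_wages = [20.0, 40.0, 60.0]
toy_employment = [10.0, 1.0, 5.0]
toy_r = weighted_pearson(toy_scores, toy_wages, toy_employment)
```

With weights, large occupations dominate the estimate, so a handful of high-employment, lower-wage occupations with substantial applicability can pull the coefficient toward zero even if many small, high-wage occupations score highly.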

Alignment with Prior Predictions

The AI applicability scores derived from real-world usage data are strongly correlated with prior expert predictions of LLM impact on occupations (e.g., Eloundou et al., 2024), with $r = 0.73$ at the occupation level and $r = 0.91$ at the major group level. However, discrepancies exist for specialized or low-employment occupations, and for roles where the mapping between observed AI usage and occupational tasks is less direct.

Implications and Future Directions

Practical Implications

Workforce Planning: The results provide actionable insights for organizations and policymakers regarding which occupational groups are most exposed to generative AI, informing reskilling and workforce development strategies.
AI Deployment: The findings suggest that current generative AI systems are best positioned as augmentative tools for knowledge and communication work, rather than as full replacements for complex, physical, or highly specialized tasks.
Task Redesign: The observed asymmetry between user goals and AI actions underscores the need for careful task decomposition and workflow redesign when integrating AI into occupational settings.

Theoretical Implications

The paper empirically validates the task-based framework for analyzing technological impact on labor, demonstrating the utility of mapping real-world AI usage to standardized occupational taxonomies.
The weak correlation between AI applicability and wage/education challenges assumptions about the distributional effects of generative AI, suggesting a more complex pattern of exposure across the labor market.

Limitations

The analysis is limited to one mainstream LLM platform (Copilot) and U.S.-centric occupational data (O*NET), potentially underrepresenting certain domains or user populations.
The mapping from AI usage to occupational impact is indirect; downstream business and organizational responses are not observed.
The O*NET taxonomy, while comprehensive, may lag behind emerging work activities and new occupations created by AI adoption.

Future Research

Longitudinal Analysis: Tracking changes in AI applicability over time as generative AI capabilities evolve and adoption patterns shift.
Cross-Platform Comparison: Extending the analysis to other LLM platforms and international contexts to capture a broader spectrum of AI usage.
Emergence of New Occupations: Investigating how new work activities and job categories arise in response to generative AI, and how existing occupations are reconstituted.
Organizational Dynamics: Studying how firms restructure tasks and roles in response to AI augmentation and automation, including the creation of hybrid human-AI workflows.

Conclusion

This paper provides a rigorous, data-driven assessment of the occupational implications of generative AI, grounded in real-world usage data and standardized occupational taxonomies. The findings highlight the current frontier of AI applicability, with the strongest impact on knowledge and communication work, and limited reach into physical and manual occupations. The methodology and results offer a foundation for ongoing measurement of AI's evolving role in the labor market and for informed policy and organizational responses to technological change.
