- The paper demonstrates that LLM adoption boosts manuscript output significantly, with increases reaching up to 89.3% in certain groups.
- The paper reveals that LLM-assisted texts exhibit higher linguistic complexity that paradoxically correlates with lower peer review success.
- The paper identifies that LLM usage diversifies citation practices by incorporating younger and less canonical literature into research.
Scientific Production in the Era of LLMs: A Detailed Synthesis
Overview
This study presents a quantitative macroanalysis of the transformation induced by LLMs in scientific research workflows. Leveraging comprehensive datasets spanning 2.1M preprints, 28K peer reviews, and 246M document accesses, the authors assess how LLM adoption affects scientific productivity, writing complexity, and citation behavior. By deploying a robust text-based LLM detection approach and an event-study methodology, the work delineates the heterogeneous impacts of LLMs across scientific domains and author backgrounds, and examines both the promise and the emerging pitfalls of algorithmically mediated scientific writing.
Methodological Approach
The study utilizes three of the largest preprint repositories: arXiv, bioRxiv, and SSRN, covering STEM, life sciences, and social sciences/humanities, respectively. LLM usage is detected with a classifier that contrasts the text distributions of pre-ChatGPT human-authored abstracts with those of post-ChatGPT LLM-rewritten text. Author-level event studies establish adoption points and allow pre- and post-adoption comparisons with non-adopter controls. Field- and time-fixed effects, as well as subgroup analyses based on inferred author linguistic background, further contextualize the results. Citation and access data are mapped via OpenAlex, Semantic Scholar, and server log analytics.
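The detection principle — contrasting the word distributions of pre-ChatGPT human-authored text with LLM-rewritten text — can be sketched as a minimal likelihood-ratio classifier. Everything below (the toy corpora, the Laplace smoothing, the bag-of-words scoring) is an illustrative assumption, not the paper's actual estimator:

```python
from collections import Counter
import math

def word_logprobs(corpus, vocab, alpha=1.0):
    """Laplace-smoothed log-probabilities of vocab words in a corpus."""
    counts = Counter(w for doc in corpus for w in doc.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}

def llr_score(text, lp_llm, lp_human):
    """Sum of per-word log-likelihood ratios; > 0 favors the LLM model."""
    return sum(lp_llm.get(w, 0.0) - lp_human.get(w, 0.0)
               for w in text.lower().split())

# Toy stand-ins for pre-ChatGPT (human) and LLM-rewritten abstracts.
human = ["we show results on the data", "we report a simple finding"]
llm = ["we delve into a comprehensive landscape",
       "a comprehensive exploration of the landscape"]
vocab = set(w for doc in human + llm for w in doc.split())
lp_h = word_logprobs(human, vocab)
lp_l = word_logprobs(llm, vocab)

print(llr_score("we delve into the comprehensive landscape", lp_l, lp_h) > 0)  # True
print(llr_score("we report simple results", lp_l, lp_h) > 0)                   # False
```

In practice such a classifier would be fit on large reference corpora and applied at the population level, estimating the fraction of abstracts drawn from each distribution rather than labeling individual papers.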
Impact on Scientific Productivity
The association between LLM adoption and manuscript output is pronounced, with statistically significant increases across all repositories: 36.2% (arXiv), 52.9% (bioRxiv), and 59.8% (SSRN) relative to matched controls. The productivity effects are largest for scholars with Asian names and institutional affiliations in Asia, where gains reach up to 89.3% (bioRxiv) and 88.9% (SSRN). Even among authors with Caucasian names in English-speaking countries, the increases remain robust at 23.7%–46.2%. The results are consistent with the hypothesis that LLMs systematically lower the cost of manuscript composition, disproportionately benefiting non-native English speakers and mitigating historical linguistic inequities in scholarly communication.
Writing Complexity and Erosion of Quality Signals
A critical insight pertains to the decoupling of linguistic complexity and publication quality. Manuscripts flagged as LLM-assisted are linguistically more complex by established readability metrics (e.g., inverse Flesch scores, syllables per word, morphological markers) than their human-authored counterparts (P<0.001). However, whereas in prior literature and in non-LLM-assisted work complexity positively correlates with acceptance and peer review scores, this signal is reversed in LLM-assisted manuscripts: increased complexity correlates with a lower probability of successful peer-reviewed publication and lower reviewer ratings.
This inversion is robust across datasets, linguistic features, and alternative metrics of scientific quality (including ICLR 2024's full peer review corpus). Thus, the paper asserts that the historical utility of polished, complex prose as a heuristic for underlying scientific merit is collapsing under LLM proliferation. The authors warn that this shift poses material risks for the scientific enterprise, potentially diluting editorial and review workflows with superficial yet linguistically sophisticated, substantively weak submissions, and threatening the efficacy of long-standing quality assessment shortcuts.
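The readability metrics cited above can be illustrated with the standard Flesch Reading Ease formula, 206.835 − 1.015·(words per sentence) − 84.6·(syllables per word); higher scores mean simpler prose, so "inverse Flesch" tracks complexity. The vowel-run syllable counter below is a crude heuristic assumed for illustration, not the paper's implementation:

```python
import re

def count_syllables(word):
    """Crude heuristic: count runs of vowels (minimum 1 per word)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Flesch Reading Ease: higher = easier; dense prose scores lower."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * len(words) / sentences
            - 84.6 * syllables / len(words))

simple = "The cat sat. The dog ran. We saw it all."
dense = ("Methodological heterogeneity notwithstanding, linguistically "
         "sophisticated manuscripts demonstrate paradoxical correlations.")
print(flesch_reading_ease(simple) > flesch_reading_ease(dense))  # True
```

The paper's finding is that scores on the complex end of this scale, once a weak positive signal of quality, now correlate negatively with review outcomes for LLM-assisted text.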
Diversification of Citation and Literature Discovery
Analysis of both access and citation data reveals a notable broadening of the knowledge base among LLM adopters. Post-adoption, researchers are:
- 11.9% more likely to cite books, indicating integration of less-structured, often overlooked reference material.
- Referencing documents that are on average 0.379 years younger, implying increased attention to recent scholarship.
- Citing works with slightly lower aggregate citation counts, countering concerns of LLMs amplifying only canonical literature.
Causal inference is constrained by selection and measurement challenges, but difference-in-differences estimates and robustness checks (e.g., Bing Chat vs. Google access flows) support the claim that LLMs facilitate exposure to a wider, younger, and less canonically entrenched scientific corpus. This is attributed to the breadth and recency of knowledge accessible via LLMs and modern AI-powered search.
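The difference-in-differences logic behind these estimates reduces to a simple 2x2 form: the change in adopters' outcomes minus the change in non-adopters' outcomes over the same window. The numbers below are hypothetical; the paper's event-study specification with field and time fixed effects is considerably richer:

```python
def did_estimate(adopter_pre, adopter_post, control_pre, control_post):
    """2x2 difference-in-differences: treated change minus control change."""
    mean = lambda xs: sum(xs) / len(xs)
    return ((mean(adopter_post) - mean(adopter_pre))
            - (mean(control_post) - mean(control_pre)))

# Hypothetical annual preprint counts per author (not the paper's data).
adopter_pre, adopter_post = [2, 3, 2, 3], [4, 5, 4, 5]
control_pre, control_post = [2, 2, 3, 3], [3, 2, 3, 2]
print(did_estimate(adopter_pre, adopter_post, control_pre, control_post))  # 2.0
```

The control group's trend nets out period-wide shocks (e.g., a field-wide surge in preprinting), so the estimate isolates the change coinciding with adoption, under the usual parallel-trends assumption.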
Theoretical and Practical Implications
The findings have several implications for the structure and operation of the scientific ecosystem:
- Democratization: Reduced linguistic barriers could rebalance the locus of scientific output geographically and demographically.
- Evaluation Crisis: The unreliability of linguistic complexity as a signal necessitates urgent innovation in editorial and peer-review practices, including the potential deployment of reviewer-assistant LLMs to surface methodological issues or verify claims.
- Literature Landscape: Greater diversity in referenced materials may foster conceptual cross-pollination but also increases the challenge for researchers and evaluators to distinguish substantial from superficial novelty amid literature expansion.
Further, as LLMs mature—encompassing deeper reasoning, discipline-specific expertise, and integration with literature retrieval—the documented effects may intensify or alter, raising fundamental questions regarding credit assignment, intellectual labor, and the meaning of scientific originality.
Limitations and Future Directions
The study is observational and relies on probabilistic LLM-usage detection based on publication abstracts (not full text), with indeterminate attribution across multi-authored papers. Non-random adoption and possible self-selection bias (e.g., a propensity for adoption among those already more prolific or better resourced) limit causal claims. The study constitutes a cross-section in a period of rapid technological evolution; subsequent LLMs may yield different or amplified effects.
Suggested research directions include:
- Development of more precise tools for LLM usage detection and attribution,
- Longitudinal tracking of quality and equity impacts as next-generation LLMs integrate into research,
- Examination of disciplinary boundaries and the potential for LLMs to dissolve communication barriers across fields,
- Investigation of alternative, possibly AI-mediated, peer-assessment protocols.
Conclusion
LLMs are catalyzing a fundamental transformation in the process, equity, and epistemology of scientific production. By accelerating manuscript output, enabling broader literature discovery, and shifting citation patterns, LLMs democratize and accelerate global scientific enterprise. However, they simultaneously destabilize traditional proxies for scientific merit, challenging institutional frameworks for quality assurance and intellectual evaluation. This emergent landscape compels institutional adaptation in editorial, evaluative, and funding bodies, alongside the development of new, rigorous mechanisms for maintaining scientific standards as AI becomes deeply embedded in the research lifecycle.