
Is ChatGPT Transforming Academics' Writing Style? (2404.08627v2)

Published 12 Apr 2024 in cs.CL, cs.AI, cs.DL, and cs.LG

Abstract: Based on one million arXiv papers submitted from May 2018 to January 2024, we assess the textual density of ChatGPT's writing style in their abstracts through a statistical analysis of word frequency changes. Our model is calibrated and validated on a mixture of real abstracts and ChatGPT-modified abstracts (simulated data) after a careful noise analysis. The words used for estimation are not fixed but adaptive, including those with decreasing frequency. We find that LLMs, represented by ChatGPT, are having an increasing impact on arXiv abstracts, especially in the field of computer science, where the fraction of LLM-style abstracts is estimated to be approximately 35%, if we take the responses of GPT-3.5 to one simple prompt, "revise the following sentences", as a baseline. We conclude with an analysis of both positive and negative aspects of the penetration of LLMs into academics' writing style.

Assessing ChatGPT's Influence on Academic Writing Through arXiv Abstracts

Introduction

The influence of ChatGPT on academic writing has drawn growing scrutiny as its use spreads across fields. Mingmeng Geng and Roberto Trotta study it by analyzing one million arXiv abstracts submitted from May 2018 to January 2024. Their paper uses a statistical analysis of word frequency changes to estimate how much of the text in these abstracts reflects ChatGPT's writing style. The effect is strongest in computer science, where the fraction of LLM-style abstracts is estimated at approximately 35%, taking the responses of GPT-3.5 to one simple prompt, "revise the following sentences", as the baseline.

Methodology

The paper distinguishes between direct and indirect ChatGPT influence on academic writing. Direct influence means using ChatGPT to generate or edit an abstract, while indirect influence means authors adopting ChatGPT's writing style themselves. The dataset comprises the abstracts of one million arXiv papers, analyzed to capture how word frequencies evolve over time, and the analysis is supported by a comparison with the Google Ngram dataset. The paper operationalizes the notion of "ChatGPT style" text through a statistical framework that covers both the direct use of ChatGPT in writing abstracts and the subtler stylistic shifts that come from frequent interaction with the model.

Observations and Analysis

The first finding is a marked change in the frequency of non-specialized words post-2023, a trend that does not fit the usual dynamics of academic writing but is consistent with ChatGPT's stylistic influence. Notably, basic words such as "are" and "is" are used less often post-2023, which further supports this reading. Taken together, these shifts act as a statistical signature of ChatGPT's growing footprint in academic writing, particularly in computer science.
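The kind of word frequency comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names and the toy abstracts are assumptions made up for this example, not the authors' code or data, and the tokenizer is deliberately crude.

```python
import re
from collections import Counter


def word_frequencies(abstracts):
    """Relative frequency of each word: occurrences divided by total tokens."""
    counts = Counter()
    for text in abstracts:
        # Crude tokenizer (an assumption): lowercase runs of ASCII letters.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def frequency_ratio(before, after, word):
    """Frequency after divided by frequency before; None if absent before."""
    f_before = before.get(word, 0.0)
    if f_before == 0.0:
        return None
    return after.get(word, 0.0) / f_before


# Toy corpora invented for illustration; these are NOT real arXiv abstracts.
ABSTRACTS_BEFORE = [
    "We show that the results are robust and the method is simple.",
    "The bounds are tight and the proof is short.",
]
ABSTRACTS_AFTER = [
    "We demonstrate significant improvements, showcasing robust results.",
    "The bounds remain tight, and the proof is short.",
]

freq_before = word_frequencies(ABSTRACTS_BEFORE)
freq_after = word_frequencies(ABSTRACTS_AFTER)

for word in ("is", "are", "showcasing"):
    print(word, frequency_ratio(freq_before, freq_after, word))
```

A ratio below 1 means the word became rarer in the later period. A word that never appears in the earlier period has no defined ratio, so the helper returns `None` rather than dividing by zero.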

Statistical Modeling of ChatGPT Impact

The paper builds a quantitative model of ChatGPT's impact that estimates how widely ChatGPT-style text appears across academic disciplines. According to the abstract, the model is calibrated and validated on simulated data, a mixture of real abstracts and ChatGPT-modified abstracts, after a careful noise analysis. The words used for estimation are not fixed but adaptive, and they include words whose frequency is decreasing. With this model the researchers find a growing share of ChatGPT-influenced abstracts over time, and they compare the prevalence of ChatGPT's stylistic tendencies with traditional academic writing norms.
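To make the idea of estimating an LLM-style fraction from word frequencies concrete, here is a minimal sketch under a strong simplifying assumption: the observed frequency of each word is a linear mixture of its frequency in human-written text and its frequency in LLM-revised text, and the mixing weight is fitted by least squares. This is not the authors' estimator (their adaptive word selection and noise analysis are not reproduced), and every name and number below is hypothetical.

```python
def mix(f_human, f_llm, eta):
    """Word frequencies of a corpus in which a fraction eta is LLM-style."""
    words = set(f_human) | set(f_llm)
    return {
        w: (1.0 - eta) * f_human.get(w, 0.0) + eta * f_llm.get(w, 0.0)
        for w in words
    }


def estimate_llm_fraction(f_obs, f_human, f_llm):
    """Least-squares fit of eta in f_obs = (1 - eta) * f_human + eta * f_llm.

    Assumed model for illustration only. The closed form is
    eta = sum(d * (f_obs - f_human)) / sum(d * d), with d = f_llm - f_human,
    clamped to the interval [0, 1].
    """
    num = 0.0
    den = 0.0
    for w in set(f_human) | set(f_llm):
        d = f_llm.get(w, 0.0) - f_human.get(w, 0.0)
        num += (f_obs.get(w, 0.0) - f_human.get(w, 0.0)) * d
        den += d * d
    if den == 0.0:
        raise ValueError("human and LLM frequencies are identical")
    return min(1.0, max(0.0, num / den))


# Hypothetical frequencies, chosen only so that the example runs.
F_HUMAN = {"is": 0.030, "are": 0.020, "significant": 0.001}
F_LLM = {"is": 0.020, "are": 0.012, "significant": 0.004}

observed = mix(F_HUMAN, F_LLM, 0.35)  # simulated corpus with a known fraction
print(estimate_llm_fraction(observed, F_HUMAN, F_LLM))
```

Because the simulated corpus is an exact mixture, the fit recovers the fraction used to build it up to floating-point error. Real data would add sampling noise and topic drift, which is why the paper calibrates and validates its model on simulated mixtures before applying it.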

Practical Implications and Future Directions

ChatGPT's integration into academic writing cuts both ways: it can make writing more efficient, and it can shift stylistic norms. The paper's findings prompt a broader discussion about the balance between automated assistance and original scholarly expression, and about where academic writing is heading in the AI era. The research also opens the way for future work on how ChatGPT's impact differs across scientific fields, which could guide the development of AI tools that respect disciplinary conventions while improving writing proficiency.

Conclusion

Geng and Trotta's paper provides an empirical basis for understanding ChatGPT's influence on academic writing, and it highlights the particularly high adoption within computer science. Its methods offer a replicable template for assessing AI's role in the evolution of academic discourse, and a foundation for continued scrutiny as AI tools become a routine part of academic writing.

Authors (2)
  1. Mingmeng Geng (7 papers)
  2. Roberto Trotta (51 papers)