
Evaluating AI and Human Authorship Quality in Academic Writing through Physics Essays

Published 8 Mar 2024 in physics.ed-ph | (2403.05458v1)

Abstract: This study evaluates $n = 300$ short-form physics essay submissions, equally divided between student work submitted before the introduction of ChatGPT and essays generated by OpenAI's GPT-4. In blinded evaluations conducted by five independent markers who were unaware of the origin of the essays, we observed no statistically significant difference in scores between essays authored by humans and those produced by AI ($p = 0.107$, $\alpha = 0.05$). Additionally, when the markers subsequently attempted to identify the authorship of the essays on a 4-point Likert scale, from 'Definitely AI' to 'Definitely Human', their performance was only marginally better than random chance. This outcome not only underscores the convergence of AI and human authorship quality but also highlights the difficulty of discerning AI-generated content solely through human judgment. Furthermore, the effectiveness of five commercially available software tools for identifying essay authorship was evaluated. Among these, ZeroGPT was the most accurate, achieving a 98% accuracy rate and a precision score of 1.0 when its classifications were reduced to binary outcomes. This result is a source of potential optimism for maintaining assessment integrity. Finally, we propose that $\leq 50\%$ AI-generated content be treated as the upper limit for a text to be classified as human-authored, a boundary that accommodates a future with ubiquitous AI assistance whilst also respecting human authorship.
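As a rough illustration of the metrics named in the abstract, the sketch below shows one way graded classifications might be reduced to binary outcomes and then scored for accuracy and precision. The numeric coding of the 4-point scale (1 = 'Definitely AI' through 4 = 'Definitely Human'), the threshold, the function names, and the toy data are all assumptions made for this example; the abstract does not state how the reduction was actually performed, and none of the numbers below come from the study.

```python
from typing import Sequence


def binarize(ratings: Sequence[int], threshold: int = 2) -> list[bool]:
    """Reduce graded ratings to binary 'flagged as AI' outcomes.

    Assumed coding (not from the paper): 1 = 'Definitely AI' ... 4 = 'Definitely Human'.
    Ratings at or below the threshold count as an 'AI' classification.
    """
    return [r <= threshold for r in ratings]


def accuracy_and_precision(
    predicted_ai: Sequence[bool], actual_ai: Sequence[bool]
) -> tuple[float, float]:
    """Return (accuracy, precision), treating 'AI' as the positive class."""
    if len(predicted_ai) != len(actual_ai) or not actual_ai:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(predicted_ai, actual_ai))
    tp = sum(p and a for p, a in pairs)          # AI essays flagged as AI
    fp = sum(p and not a for p, a in pairs)      # human essays flagged as AI
    tn = sum(not p and not a for p, a in pairs)  # human essays passed as human
    accuracy = (tp + tn) / len(pairs)
    # Precision is undefined when nothing is flagged as AI.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return accuracy, precision


if __name__ == "__main__":
    # Hypothetical toy data, purely illustrative.
    toy_ratings = [1, 2, 3, 4, 1, 4]
    toy_truth = [True, True, False, False, False, True]
    acc, prec = accuracy_and_precision(binarize(toy_ratings), toy_truth)
    print(f"accuracy={acc:.3f} precision={prec:.3f}")
```

Under this definition, a precision of 1.0 means that no human-written essay was flagged as AI (zero false positives), while an accuracy below 100% can still leave some AI-generated essays undetected.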
