How should the advent of large language models affect the practice of science? (2312.03759v1)

Published 5 Dec 2023 in cs.CL, cs.AI, cs.CY, and cs.DL

Abstract: LLMs are being increasingly incorporated into scientific workflows. However, we have yet to fully grasp the implications of this integration. How should the advent of LLMs affect the practice of science? For this opinion piece, we have invited four diverse groups of scientists to reflect on this query, sharing their perspectives and engaging in debate. Schulz et al. argue that working with LLMs is not fundamentally different from working with human collaborators, while Bender et al. argue that LLMs are often misused and over-hyped, and that their limitations warrant a focus on more specialized, easily interpretable tools. Marelli et al. emphasize the importance of transparent attribution and responsible use of LLMs. Finally, Botvinick and Gershman advocate that humans should retain responsibility for determining the scientific roadmap. To facilitate the discussion, the four perspectives are complemented with a response from each group. By putting these different perspectives in conversation, we aim to bring attention to important considerations within the academic community regarding the adoption of LLMs and their impact on both current and future scientific practices.
