
Oil & Water? Diffusion of AI Within and Across Scientific Fields (2405.15828v1)

Published 24 May 2024 in cs.DL and cs.AI

Abstract: This study empirically investigates claims of the increasing ubiquity of AI within roughly 80 million research publications across 20 diverse scientific fields, by examining the change in scholarly engagement with AI from 1985 through 2022. We observe exponential growth, with AI-engaged publications increasing approximately thirteenfold (13x) across all fields, suggesting a dramatic shift from niche to mainstream. Moreover, we provide the first empirical examination of the distribution of AI-engaged publications across publication venues within individual fields, with results that reveal a broadening of AI engagement within disciplines. While this broadening engagement suggests a move toward greater disciplinary integration in every field, increased ubiquity is associated with a semantic tension between AI-engaged research and more traditional disciplinary research. Through an analysis of tens of millions of document embeddings, we observe a complex interplay between AI-engaged and non-AI-engaged research within and across fields, suggesting that increasing ubiquity is something of an oil-and-water phenomenon -- AI-engaged work is spreading out over fields, but not mixing well with non-AI-engaged work.

Authors (5)
  1. Eamon Duede (16 papers)
  2. William Dolan (1 paper)
  3. André Bauer (11 papers)
  4. Ian Foster (138 papers)
  5. Karim Lakhani (3 papers)

Summary

Diffusion of AI Within and Across Scientific Fields: Evaluating Ubiquity and Semantic Tension

The paper "Oil & Water? Diffusion of AI Within and Across Scientific Fields" presents an empirical investigation into the increasing ubiquity of AI across scientific fields from 1985 to 2022. Using a dataset of approximately 80 million research publications across 20 disciplines, it examines the exponential growth in AI engagement and how far AI has been integrated into the scientific landscape.

The authors assess AI's ubiquity through three research questions. The first examines how the share of AI-engaged publications in scholarly research has changed over time. Here, the findings reveal a 1293% increase in AI-engaged papers between 1985 and 2022, with nearly 9% of all papers being AI-engaged by the latter year. Such growth underscores AI's transition from a niche specialization to a pervasive element of research across multiple domains.

Addressing the second research question, the paper evaluates the extent of AI diffusion within individual fields by analyzing how AI-engaged publications are distributed across publication venues. The authors measure this distribution with the Gini coefficient and define 'Ubiquity' as its inverse, so that higher values correspond to wider diffusion. The results show that AI engagement is spreading across publication venues rather than remaining concentrated in a few, a substantive broadening of AI's influence within disciplines.
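To make the venue-concentration measure concrete, the following is a minimal sketch of a Gini coefficient computed over per-venue counts of AI-engaged papers. The function names `gini` and `ubiquity` are hypothetical, and reading "inverse" as `1 - G` is an assumption made here for illustration; the summary does not state the exact transformation the authors use.

```python
def gini(counts):
    """Gini coefficient of non-negative per-venue counts.

    0.0 means AI-engaged papers are spread evenly across venues;
    values near 1.0 mean they are concentrated in very few venues.
    """
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # no venues or no papers: treat as perfectly even
    # Standard rank-weighted formula with 1-indexed ranks over sorted values.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def ubiquity(counts):
    """ASSUMPTION: 'inverse of the Gini coefficient' is read as 1 - G."""
    return 1.0 - gini(counts)


# Toy data (not from the paper): AI-engaged paper counts in four venues.
concentrated = [0, 0, 0, 10]   # everything appears in one venue
spread_out = [5, 5, 5, 5]      # evenly distributed

print(gini(concentrated), ubiquity(concentrated))
print(gini(spread_out), ubiquity(spread_out))
```

Under this reading, a field whose AI-engaged work moves from a handful of specialist venues into most of its journals would show a falling Gini coefficient and a rising ubiquity score.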

The third research question explores changes in the semantic nature of AI-engaged and non-AI-engaged research as AI becomes more ubiquitous. Through document embeddings examined using SPECTER2 models, the paper assesses how AI engagement influences the themes and topics within fields. Notably, it finds that as AI ubiquity increases, AI-engaged research within fields becomes more semantically aligned with AI research in Computer Science, yet simultaneously diverges from non-AI-engaged work within the same field. This finding suggests a semantic tension between traditional research and emergent AI paradigms, implying that while AI knowledge is infiltrating various disciplines, it is not harmonizing effortlessly with existing scholarly work.
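One common way to compare groups of document embeddings is to average each group into a centroid and take the cosine similarity between centroids. The sketch below illustrates that idea with toy two-dimensional vectors; it is not the authors' procedure, which the summary does not specify, and the helper names `centroid` and `cosine` are hypothetical. In practice the vectors would be high-dimensional embeddings such as those produced by SPECTER2.

```python
import math


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy embeddings (illustrative only, not data from the paper).
cs_ai = [[1.0, 0.1], [0.9, 0.0]]          # AI research in Computer Science
field_ai = [[0.8, 0.3], [0.9, 0.2]]       # AI-engaged papers in another field
field_non_ai = [[0.1, 1.0], [0.2, 0.9]]   # non-AI-engaged papers, same field

sim_to_cs = cosine(centroid(field_ai), centroid(cs_ai))
sim_to_own_field = cosine(centroid(field_ai), centroid(field_non_ai))

# The 'oil and water' pattern corresponds to the first similarity
# being high while the second stays low.
print(sim_to_cs > sim_to_own_field)
```

Tracking both similarities year by year would show whether a field's AI-engaged work is drifting toward Computer Science while moving away from the rest of its own discipline.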

The implications of these findings are multifaceted. Practically, this paper highlights the widespread penetration of AI as an interdisciplinary tool, suggesting that stakeholders in academia and industry should foster integrative frameworks that accommodate such technological diffusion. Theoretically, it prompts reconsideration of established models of technological diffusion, as the observed 'oil and water' dynamic points towards a nuanced interplay between new and traditional research paradigms.

Future developments in AI research and application could entail further diversification of AI methodologies tailored to distinct disciplinary needs. There is potential for AI to catalyze interdisciplinary collaborations, driving innovation at the intersections of fields. As AI continues to reshape the landscapes it touches, careful attention must be paid to fostering integrative approaches that balance novel AI methodologies with the well-established practices of traditional fields.

In summary, the paper provides a detailed empirical account of AI's pervasive growth within scientific domains, marked by rising engagement and ubiquity. It also articulates the challenge inherent in this growth: integrating AI into diverse domains while managing semantic tension with non-AI-engaged research. That challenge remains fertile ground for further inquiry into AI's academic and practical applications.
