Oil & Water? Diffusion of AI Within and Across Scientific Fields (2405.15828v1)
Abstract: This study empirically investigates claims of the increasing ubiquity of AI within roughly 80 million research publications across 20 diverse scientific fields, by examining the change in scholarly engagement with AI from 1985 through 2022. We observe exponential growth, with AI-engaged publications increasing approximately thirteenfold (13x) across all fields, suggesting a dramatic shift from niche to mainstream. Moreover, we provide the first empirical examination of the distribution of AI-engaged publications across publication venues within individual fields, with results that reveal a broadening of AI engagement within disciplines. While this broadening engagement suggests a move toward greater disciplinary integration in every field, increased ubiquity is associated with a semantic tension between AI-engaged research and more traditional disciplinary research. Through an analysis of tens of millions of document embeddings, we observe a complex interplay between AI-engaged and non-AI-engaged research within and across fields, suggesting that increasing ubiquity is something of an oil-and-water phenomenon -- AI-engaged work is spreading out over fields, but not mixing well with non-AI-engaged work.
Authors: Eamon Duede, William Dolan, André Bauer, Ian Foster, Karim Lakhani