AI on AI: Exploring the Utility of GPT as an Expert Annotator of AI Publications (2403.09097v1)
Abstract: Identifying scientific publications that fall within a dynamic field of research often requires costly annotation by subject-matter experts. Resources such as widely accepted classification criteria or field taxonomies are unavailable for a domain like AI, which spans emerging topics and technologies. We address these challenges by inferring a functional definition of AI research from existing expert labels, and then evaluating state-of-the-art chatbot models on the task of expert data annotation. Using the arXiv publication database as ground truth, we experiment with prompt engineering for GPT chatbot models to identify an alternative, automated expert annotation pipeline that assigns AI labels with 94% accuracy. For comparison, we fine-tune SPECTER, a transformer language model pre-trained on scientific publications, which achieves 96% accuracy (only two percentage points higher than GPT) on classifying AI publications. Our results indicate that with effective prompt engineering, chatbots can be used as reliable data annotators even where subject-area expertise is required. To evaluate the utility of chatbot-annotated datasets on downstream classification tasks, we train a new classifier on GPT-labeled data and compare its performance to the arXiv-trained model. The classifier trained on GPT-labeled data outperforms the arXiv-trained model by nine percentage points, achieving 82% accuracy.
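The evaluation described in the abstract (prompt a chatbot for a yes/no AI label, then score the labels against ground truth) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the prompt wording, the function names `build_prompt`, `parse_label`, `annotate`, and `accuracy`, and the toy records are all assumptions, and `stub_model` is a keyword-matching stand-in for a real chatbot call.

```python
from typing import Callable, List, Sequence, Tuple


def build_prompt(title: str, abstract: str) -> str:
    """Hypothetical zero-shot prompt; the paper's actual prompts are not reproduced here."""
    return (
        "You are an expert annotator of scientific publications.\n"
        "Is the following publication AI research? Answer 'yes' or 'no'.\n\n"
        f"Title: {title}\nAbstract: {abstract}\n"
    )


def parse_label(response: str) -> bool:
    """Map a free-text model response to a binary AI / not-AI label."""
    return response.strip().lower().startswith("yes")


def annotate(
    papers: Sequence[Tuple[str, str]], ask_model: Callable[[str], str]
) -> List[bool]:
    """Label each (title, abstract) pair using the supplied model callable."""
    return [parse_label(ask_model(build_prompt(t, a))) for t, a in papers]


def accuracy(predicted: Sequence[bool], expected: Sequence[bool]) -> float:
    """Fraction of predicted labels that match the ground-truth labels."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


def stub_model(prompt: str) -> str:
    """Assumed stand-in for a chatbot API call: a trivial keyword rule."""
    return "Yes" if "neural network" in prompt.lower() else "No"


# Toy records (invented for illustration) with ground-truth labels.
papers = [
    ("Training deep models", "We train a neural network on image data."),
    ("A galaxy survey", "We measure redshifts for nearby galaxies."),
]
gold = [True, False]

predicted = annotate(papers, stub_model)
print(f"accuracy = {accuracy(predicted, gold):.2f}")
```

Passing the model as a callable keeps the scoring logic independent of any particular chatbot API, so the same `accuracy` function could score labels from a fine-tuned classifier as well.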
Authors: Autumn Toney-Wails, Christian Schoeberl, James Dunham