An Ensemble Approach for Research Article Identification: a Case Study in Artificial Intelligence
Abstract: This study presents an ensemble approach that addresses the challenges of identification and analysis of research articles in rapidly evolving fields, using the field of AI as a case study. Our approach included using decision tree, sciBERT and regular expression matching on different fields of the articles, and a SVM to merge the results from different models. We evaluated the effectiveness of our method on a manually labeled dataset, finding that our combined approach captured around 97% of AI-related articles in the web of science (WoS) corpus with a precision of 0.92. This presents a 0.15 increase in F1 score compared with existing search term based approach. Following this, we analyzed the publication volume trends and common research themes.We found that compared with existing methods, our ensemble approach revealed an increased degree of interdisciplinarity, and was able to identify more articles in certain subfields like feature extraction and optimization. This study demonstrates the potential of our approach as a tool for the accurate identification of scholarly articles, which is also capable of providing insights into the volume and content of a research area.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.