Reliable suggestion of new scientific ideas and impact evaluation by large language models
Determine how large language models (specifically GPT-4, Gemini, and LLaMA-2) can reliably suggest new scientific ideas and evaluate the prospective impact of those ideas in the near term.
References
However, these models often struggle in scientific reasoning, and it remains unclear how they can suggest new scientific ideas or evaluate their impact in a reliable way in the near term.
— Forecasting high-impact research topics via machine learning on evolving knowledge graphs
(2402.08640 - Gu et al., 13 Feb 2024) in Introduction