SurveyX: Academic Survey Automation via Large Language Models (2502.14776v2)

Published 20 Feb 2025 in cs.CL

Abstract: LLMs have demonstrated exceptional comprehension capabilities and a vast knowledge base, suggesting that LLMs can serve as efficient tools for automated survey generation. However, recent research related to automated survey generation remains constrained by critical limitations such as a finite context window, a lack of in-depth content discussion, and the absence of systematic evaluation frameworks. Inspired by human writing processes, we propose SurveyX, an efficient and organized system for automated survey generation that decomposes the survey composing process into two phases: the Preparation and Generation phases. By innovatively introducing online reference retrieval, a pre-processing method called AttributeTree, and a re-polishing process, SurveyX significantly enhances the efficacy of survey composition. Experimental evaluation results show that SurveyX outperforms existing automated survey generation systems in content quality (0.259 improvement) and citation quality (1.76 enhancement), approaching human expert performance across multiple evaluation dimensions. Examples of surveys generated by SurveyX are available at www.surveyx.cn.

Summary

The paper introduces SurveyX, a novel system automating academic survey generation using large language models through a two-phase process involving keyword expansion, AttributeTree, outline optimization, and RAG-based rewriting.
Evaluations demonstrate that SurveyX significantly improves content quality by 0.259 and citation quality by 1.76 over existing automated survey generation systems.
Ablation studies confirm the critical contributions of the AttributeTree method to structure and the RAG-based rewriting module to citation metrics within the SurveyX system.

This paper introduces SurveyX, an automated system designed to generate academic surveys by leveraging LLMs.

SurveyX employs a two-phase process involving a Preparation Phase using a keyword expansion algorithm and AttributeTree for enhanced information density, followed by a Generation Phase incorporating outline optimization and a RAG-based rewriting module.
The system's performance was evaluated across content quality, citation quality, and reference relevance, demonstrating a content quality improvement of 0.259 and a citation quality enhancement of 1.76 over existing automated survey generation systems.
Ablation studies validate the contribution of each module: removing the AttributeTree method lowers the structure score from 4.91 to 4.08, and removing the RAG-based rewriting module drops citation recall from 85.23 to 55.37.
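The keyword expansion step in the Preparation Phase can be illustrated with a small sketch. The summary does not give the paper's exact selection rule, so the code below swaps in a simpler one as an assumption: among a set of candidate keywords, pick the one whose embedding has the highest mean cosine similarity to the keywords already in the pool. The function names (`cos_sim`, `select_keyword`) and the toy embedding table are hypothetical; a real system would call an actual embedding model in place of `toy_embed`.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

Vector = Sequence[float]


def cos_sim(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_keyword(
    candidates: Sequence[str],
    pool: Sequence[str],
    embed: Callable[[str], Vector],
) -> str:
    """Return the candidate closest on average to the existing keyword pool.

    Assumption: mean cosine similarity stands in for the paper's
    ranking-based rule, which is not reproduced here.
    """
    if not candidates or not pool:
        raise ValueError("candidates and pool must both be non-empty")
    pool_vecs = [embed(k) for k in pool]

    def score(candidate: str) -> float:
        vec = embed(candidate)
        return sum(cos_sim(vec, p) for p in pool_vecs) / len(pool_vecs)

    return max(candidates, key=score)


# Hypothetical 2-D embeddings, used only to make the sketch runnable.
TOY_EMBEDDINGS: Dict[str, Tuple[float, float]] = {
    "survey generation": (1.0, 0.0),
    "large language models": (0.8, 0.6),
    "automatic literature review": (0.9, 0.1),
    "protein folding": (0.0, 1.0),
}


def toy_embed(keyword: str) -> Vector:
    return TOY_EMBEDDINGS[keyword]


chosen = select_keyword(
    candidates=["automatic literature review", "protein folding"],
    pool=["survey generation", "large language models"],
    embed=toy_embed,
)
print(chosen)
```

Repeating this selection and appending each chosen keyword to the pool would grow the pool one keyword at a time; how the actual system generates candidates and decides when to stop is not covered by this summary.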

Here are the variables used in the LaTeX formulas:

$k^*$ = the most appropriate keyword to add to the keyword pool
$k_c$ = keywords for each cluster
$K_C$ = the set of keywords for each cluster
$K_{pool}$ = the existing keyword pool
$E(\cdot)$ = the embedding model
$cos\_sim$ = cosine similarity
$R_1$ = ranking calculation method 1
$R_2$ = ranking calculation method 2
$|K_{pool}|$ = the number of keywords in the keyword pool
$Doc_{human}$ = references retrieved by humans
$Doc_{machine}$ = references retrieved by machines
$d$ = individual document
$topic$ = survey topic
$\mathbb{I}_{\text{relevant}}$ = indicator function which evaluates to 1 if the LLM determines the reference is relevant to the topic, otherwise 0
$LLM(Prompt(d,topic))$ = the LLM's assessment of relevance given a prompt containing document $d$ and topic
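The indicator-based definitions above suggest how an LLM-judged reference relevance score could be computed. The following is a minimal sketch, not the paper's formula: it assumes the score is the fraction of retrieved documents for which the indicator evaluates to 1, and it replaces the LLM call with a hypothetical keyword-matching judge (`toy_judge`) so the example runs without any external service.

```python
from typing import Callable, Iterable


def relevance_score(
    docs: Iterable[str],
    topic: str,
    judge: Callable[[str, str], bool],
) -> float:
    """Fraction of documents the judge marks as relevant to the topic.

    `judge(d, topic)` plays the role of the indicator applied to
    LLM(Prompt(d, topic)). Averaging over the document set is an
    assumption of this sketch.
    """
    doc_list = list(docs)
    if not doc_list:
        return 0.0
    relevant = sum(1 for d in doc_list if judge(d, topic))
    return relevant / len(doc_list)


def toy_judge(doc: str, topic: str) -> bool:
    """Hypothetical stand-in for an LLM relevance judgment: substring match."""
    return topic.lower() in doc.lower()


machine_docs = [
    "A study of automated survey generation with LLMs",
    "Survey generation pipelines and retrieval",
    "Crystal structure prediction for alloys",
]
score = relevance_score(machine_docs, "survey generation", toy_judge)
print(f"{score:.3f}")
```

In practice `judge` would wrap a prompt to a language model and parse its yes/no answer; the prompt wording and parsing logic are not specified in this summary and are left out here.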
