Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

LitFM: A Retrieval Augmented Structure-aware Foundation Model For Citation Graphs (2409.12177v1)

Published 5 Sep 2024 in cs.SI and cs.DL

Abstract: With the advent of LLMs, managing scientific literature via LLMs has become a promising direction of research. However, existing approaches often overlook the rich structural and semantic relevance among scientific literature, limiting their ability to discern the relationships between pieces of scientific knowledge, and suffer from various types of hallucinations. These methods also focus narrowly on individual downstream tasks, limiting their applicability across use cases. Here we propose LitFM, the first literature foundation model designed for a wide variety of practical downstream tasks on domain-specific literature, with a focus on citation information. At its core, LitFM contains a novel graph retriever to integrate graph structure by navigating citation graphs and extracting relevant literature, thereby enhancing model reliability. LitFM also leverages a knowledge-infused LLM, fine-tuned through a well-developed instruction paradigm. It enables LitFM to extract domain-specific knowledge from literature and reason relationships among them. By integrating citation graphs during both training and inference, LitFM can generalize to unseen papers and accurately assess their relevance within existing literature. Additionally, we introduce new large-scale literature citation benchmark datasets on three academic fields, featuring sentence-level citation information and local context. Extensive experiments validate the superiority of LitFM, achieving 28.1% improvement on retrieval task in precision, and an average improvement of 7.52% over state-of-the-art across six downstream literature-related tasks

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (9)
  1. Jiasheng Zhang (20 papers)
  2. Jialin Chen (39 papers)
  3. Ali Maatouk (35 papers)
  4. Ngoc Bui (12 papers)
  5. Qianqian Xie (60 papers)
  6. Leandros Tassiulas (89 papers)
  7. Jie Shao (53 papers)
  8. Hua Xu (78 papers)
  9. Rex Ying (90 papers)

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com