Bridging Research and Readers: A Multi-Modal Automated Academic Papers Interpretation System (2401.09150v1)

Published 17 Jan 2024 in cs.CL

Abstract: In the contemporary information era, significantly accelerated by the advent of Large Language Models (LLMs), the proliferation of scientific literature is reaching unprecedented levels. Researchers urgently require efficient tools for reading and summarizing academic papers, uncovering significant scientific literature, and employing diverse interpretative methodologies. To address this burgeoning demand, automated scientific literature interpretation systems have become paramount. However, prevailing models, both commercial and open-source, face notable challenges: they often overlook multimodal data, struggle to summarize over-length texts, and lack diverse user interfaces. In response, we introduce an open-source Multi-Modal Automated Academic Paper Interpretation System (MMAPIS) built around a three-stage process and incorporating LLMs to augment its functionality. Our system first employs a hybrid modality preprocessing and alignment module to separately extract plain text and tables or figures from documents. It then aligns this information by the section names it belongs to, ensuring that data with identical section names are grouped under the same section. Next, we introduce a hierarchical discourse-aware summarization method that uses the extracted section names to divide the article into shorter text segments, enabling targeted summarization both within and between sections via LLMs with section-specific prompts. Finally, we design four diversified user interfaces, including paper recommendation, multimodal Q&A, audio broadcasting, and an interpretation blog, which can be applied across a wide range of scenarios. Our qualitative and quantitative evaluations underscore the system's superiority, especially in scientific summarization, where it outperforms solutions relying solely on GPT-4.
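
The hierarchical discourse-aware summarization stage is the core idea: split the paper at section boundaries, summarize each section on its own, then fuse the per-section summaries into a document-level summary. Below is a minimal Python sketch of that two-pass scheme, assuming the pipeline's intermediate format is Markdown with #/## headings (as PDF-to-Markdown converters typically produce); call_llm is a hypothetical stand-in for whatever chat-completion API is used, not the authors' actual implementation.

    import re

    def split_by_sections(markdown_text):
        """Split a Markdown document into (section_name, body) pairs,
        treating top- and second-level headings as discourse boundaries."""
        # With a capturing group, re.split keeps the headings in the output:
        # [preamble, heading1, body1, heading2, body2, ...]
        parts = re.split(r"^(#{1,2}\s+.+)$", markdown_text, flags=re.MULTILINE)
        sections = []
        for i in range(1, len(parts), 2):
            name = parts[i].lstrip("# ").strip()
            body = parts[i + 1].strip()
            sections.append((name, body))
        return sections

    def call_llm(prompt):
        # Hypothetical placeholder: wire this to an actual LLM client
        # (e.g. a chat-completion endpoint) before use.
        raise NotImplementedError

    def hierarchical_summary(markdown_text):
        # Pass 1 (intra-section): summarize each section with a
        # section-specific prompt, so every call stays short.
        section_summaries = []
        for name, body in split_by_sections(markdown_text):
            prompt = (f"Summarize the '{name}' section of a research paper "
                      f"in 2-3 sentences:\n\n{body}")
            section_summaries.append((name, call_llm(prompt)))
        # Pass 2 (inter-section): fuse the short per-section summaries
        # into one coherent document-level summary.
        joined = "\n".join(f"{n}: {s}" for n, s in section_summaries)
        return call_llm("Combine these per-section summaries into a "
                        "coherent summary of the whole paper:\n\n" + joined)

Because each pass operates on section-sized segments rather than the full paper, no single LLM call needs the whole document in context, which is what lets the method handle over-length papers that a single GPT-4 call cannot.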

Authors (3)
  1. Feng Jiang (97 papers)
  2. Kuang Wang (3 papers)
  3. Haizhou Li (285 papers)
Citations (3)