Papers

Topics

Authors

Recent

View all

Gemini 2.5 Flash

Gemini 2.5 Flash 92 tok/s

Gemini 2.5 Pro 59 tok/s Pro

GPT-5 Medium 22 tok/s

GPT-5 High 29 tok/s Pro

GPT-4o 94 tok/s

GPT OSS 120B 471 tok/s Pro

Kimi K2 212 tok/s Pro

2000 character limit reached

aiXiv: A Next-Generation Open Access Ecosystem for Scientific Discovery Generated by AI Scientists (2508.15126v1)

Published 20 Aug 2025 in cs.AI and cs.CL

Abstract: Recent advances in LLMs have enabled AI agents to autonomously generate scientific proposals, conduct experiments, author papers, and perform peer reviews. Yet this flood of AI-generated research content collides with a fragmented and largely closed publication ecosystem. Traditional journals and conferences rely on human peer review, making them difficult to scale and often reluctant to accept AI-generated research content; existing preprint servers (e.g. arXiv) lack rigorous quality-control mechanisms. Consequently, a significant amount of high-quality AI-generated research lacks appropriate venues for dissemination, hindering its potential to advance scientific progress. To address these challenges, we introduce aiXiv, a next-generation open-access platform for human and AI scientists. Its multi-agent architecture allows research proposals and papers to be submitted, reviewed, and iteratively refined by both human and AI scientists. It also provides API and MCP interfaces that enable seamless integration of heterogeneous human and AI scientists, creating a scalable and extensible ecosystem for autonomous scientific discovery. Through extensive experiments, we demonstrate that aiXiv is a reliable and robust platform that significantly enhances the quality of AI-generated research proposals and papers after iterative revising and reviewing on aiXiv. Our work lays the groundwork for a next-generation open-access ecosystem for AI scientists, accelerating the publication and dissemination of high-quality AI-generated research content. Code is available at https://github.com/aixiv-org. Website is available at https://forms.gle/DxQgCtXFsJ4paMtn8.

Collections

Summary

The paper introduces aiXiv, a multi-agent open-access platform that supports autonomous scientific research and end-to-end iterative review.
The system employs structured review with retrieval-augmented generation and majority voting to enhance quality control and mitigate reviewer bias.
Experimental results show over 90% of revised submissions rated superior, demonstrating effective prompt injection defenses and significant quality improvements.

aiXiv: An Open Ecosystem for Autonomous Scientific Discovery

Motivation and Context

The proliferation of LLM-driven scientific research has exposed critical deficiencies in the current publication infrastructure. Traditional venues are not equipped to handle the scale, velocity, and unique requirements of AI-generated content, particularly regarding peer review, quality control, and transparent attribution. Existing preprint servers lack rigorous review mechanisms, while journals and conferences are often closed to AI authorship and struggle with scalability. The aiXiv platform is introduced to address these gaps by providing a unified, extensible, and open-access ecosystem for both human and AI scientists, supporting the full lifecycle of scientific discovery from proposal generation to publication.

Figure 1: The aiXiv platform architecture integrates multi-agent workflows, structured review, and iterative refinement pipelines for end-to-end autonomous scientific discovery.

Platform Architecture and Workflow

aiXiv is designed as a multi-agent system supporting autonomous generation, review, revision, and publication of scientific proposals and papers. The workflow is structured as follows:

Submission: AI agents (and optionally humans) submit proposals or papers, adhering to standardized formats.
Automated Review: Submissions are routed to a panel of LLM-based review agents, which provide structured, revision-oriented feedback. The review process leverages retrieval-augmented generation (RAG) for grounding critiques in external literature.
Revision and Resubmission: Authors (AI agents) revise their work based on feedback and resubmit for further evaluation.
Multi-Agent Voting: Acceptance decisions are made via majority voting among five high-performing LLMs, mitigating single-model bias.
Publication and Attribution: Accepted works are assigned DOIs and published with clear attribution to the originating AI model and any human contributors.

The platform exposes APIs and Model Control Protocols (MCPs) for seamless integration of heterogeneous agents, enabling scalable collaboration and extensibility.

Figure 2: The aiXiv homepage, illustrating the multi-agent workflow for submission, review, and refinement of scientific content.

Review Framework and Quality Control

aiXiv implements a dual-mode review framework:

Direct Review Mode: LLM agents provide detailed, criterion-based feedback on methodological quality, novelty, clarity, and feasibility. Meta-review agents synthesize subfield-specific reviews for comprehensive assessment.
Pairwise Review Mode: Systematic comparison of original and revised submissions quantifies improvement, using rubrics aligned with top-tier conference standards.

The review pipeline is augmented with retrieval from external scientific databases, ensuring critiques are contextually grounded and reducing hallucination risk.

Prompt Injection Defense

A multi-stage pipeline is deployed to detect and mitigate prompt injection attacks targeting LLM reviewers. The pipeline includes:

PDF content and metadata extraction for layout-level anomaly detection.
Parallel rule-based scanning for known injection patterns (e.g., white text, zero-width characters).
Semantic verification via LLM analysis and multilingual cross-validation.
Attack categorization and risk scoring, with threshold-based flagging for further action.

This approach achieves high detection accuracy (84.8% synthetic, 87.9% real-world), addressing a critical vulnerability in automated review systems.

Experimental Results

Comprehensive experiments demonstrate the efficacy of aiXiv across multiple dimensions:

Pairwise Assessment Alignment: The GPT-4.1-based evaluation model with RAG achieves 77% accuracy on proposal-level benchmarks and 81% on paper-level, outperforming prior baselines (e.g., DeepReview, AI Researcher).
Review-Driven Quality Improvement: Over 90% of revised proposals and papers are rated superior to their originals; inclusion of response letters further increases preference rates to nearly 100%.
Multi-Agent Voting: Initial submissions have near-zero acceptance rates; post-revision, proposals reach 45.2% and papers 70% acceptance, indicating substantial quality gains from iterative review.
Prompt Injection Robustness: The detection pipeline generalizes across synthetic and real-world attacks, providing a robust safeguard for LLM-based review.
Figure 3: aiXiv achieves state-of-the-art pairwise accuracy and substantial quality improvement after iterative review and refinement, outperforming DeepReview and AI Researcher baselines.

Limitations

Despite its advances, aiXiv faces several limitations:

Autonomous Scientific Capability: Current AI Scientist systems are not yet capable of fully autonomous, high-quality research without human oversight, particularly in experimental design and cross-domain generalization.
External Validity: Validation is restricted to simulated environments; real-world experimentation and human-in-the-loop evaluation are needed for broader applicability.
Adaptive Learning: The platform lacks robust continual learning and error-correction mechanisms for dynamic, open-ended scientific inquiry.

Ethical Considerations

The deployment of aiXiv raises ethical concerns regarding hallucinated content, evaluation bias, and transparency of AI involvement. The platform enforces multi-stage verification, visible labeling of synthetic content, and diversity safeguards in review. A comprehensive use policy and disclaimer agreement are mandated for all users.

Implications and Future Directions

aiXiv establishes a scalable infrastructure for autonomous and collaborative scientific research, enabling rapid dissemination and iterative refinement of AI-generated content. The platform's closed-loop review and multi-agent voting mechanisms set new standards for quality control in AI-driven science. Future work will focus on integrating reinforcement learning for agent evolution, autonomous knowledge acquisition, and human-AI co-evolutionary research environments. The long-term vision is a sustainable, open-access ecosystem supporting both human and machine-driven scientific discovery.

Conclusion

aiXiv represents a significant step toward an open, scalable, and rigorous ecosystem for autonomous scientific research. By integrating multi-agent workflows, structured review, iterative refinement, and robust security measures, the platform addresses critical challenges in the dissemination and evaluation of AI-generated scientific content. Experimental results demonstrate measurable improvements in quality and reliability, laying the groundwork for future developments in autonomous science and human-AI collaboration.

PDF Markdown

Paper Prompts

Explore 10 Community Prompts

Follow-up Questions

Authors (23)

First 10 authors:

GitHub

aixiv-org · GitHub

Tweets

https://twitter.com/rohanpaul_ai/status/1959736295938572743

https://twitter.com/Dr_Singularity/status/1959354274775023897

https://twitter.com/bimedotcom/status/1961414252641534284

https://twitter.com/JacksonAtkinsX/status/1959346888810914150

https://twitter.com/daniel_mac8/status/1962484543916548172

https://twitter.com/QuotiDoc/status/1959188993112948856