- The paper presents a three-stage framework tracing how foundation models (FMs) move from meta-scientific integration, through hybrid human-AI co-creation, to autonomous scientific discovery.
- FMs enhance data curation, experiment design, hypothesis generation, and simulation, thereby boosting efficiency and innovation across research paradigms.
- The work highlights emerging risks such as bias, reproducibility challenges, and ethical concerns as FMs assume greater epistemic agency.
Foundation Models in Scientific Discovery: Stages, Capabilities, and the Paradigm Transition
Introduction: Scientific Paradigms and the Role of Foundation Models
Scientific discovery has historically advanced through discrete epistemic paradigms—empirical experimentation, theoretical abstraction, computational simulation, and large-scale data-driven inference—each with unique strengths and shortcomings. As scientific challenges have become increasingly complex and cross-disciplinary, the limitations of these isolated paradigms have become acute: issues of irreducible complexity, combinatorial explosion, data deluge, and the mismatch between linear models and dynamic phenomena persist across the sciences.
Foundation Models (FMs)—large-scale neural architectures pretrained on heterogeneous data and adaptable to a broad spectrum of tasks—are now emerging as a critical response to these challenges. FMs such as GPT-4, AlphaFold, and domain-specialized models (e.g., for materials science or weather) have demonstrated not just performance improvements within established workflows, but also new types of epistemic agency in problem formulation, hypothesis generation, experimental automation, and even theory-building. This paper posits that FMs catalyze a transition in the scientific paradigm, from incremental workflow enhancement to a novel regime of hybrid and autonomous scientific discovery.
The Three-Stage Framework for FM Integration
The authors introduce a conceptual framework for the evolution of FMs in scientific discovery, defined by the degree of FM autonomy and epistemic agency:
Figure 1: Foundation models transition from tool-like infrastructure in meta-scientific integration, to collaborative co-creation, and ultimately to autonomous agents capable of end-to-end scientific discovery.
- Meta-Scientific Integration: FMs act as intelligent infrastructure, improving data interoperability, automating repetitive tasks (e.g., data curation, literature review), and boosting throughput within traditional paradigms. They serve as backend enablers but remain epistemically inert—fully subordinate to human-defined objectives.
- Hybrid Human-AI Co-Creation: FMs mature from tools to active collaborators, participating in ideation, hypothesis generation, experiment planning, and partial task execution. This regime reconfigures human–machine cognitive labor—FMs provide creativity, generalization, and memory over large knowledge spaces, while humans guide judgment, ethical oversight, and the framing of research goals.
- Autonomous Scientific Discovery: FMs increasingly operate as self-directed agents, performing the full scientific cycle—posing questions, generating hypotheses, engineering and executing experiments or simulations, interpreting results, and iteratively updating internal models. This stage marks a qualitative transition: the locus of scientific agency shifts partially from human to machine, raising foundational questions around validation, authorship, and epistemic norms.
FM Integration Across Classical Scientific Paradigms
Experimental Paradigm
FMs are extensively applied in experiment design optimization and laboratory automation. They serve as priors and feature extractors for Bayesian optimization and active learning workflows in experimental sciences, achieving data-efficiency gains and accelerating convergence in high-dimensional search spaces (e.g., molecular discovery, quantum experiments). FMs are increasingly leveraged as planners and interface layers in robotic laboratories, converting natural language objectives to executable code or hardware control instructions [Boiko2023; Ruan2024; Yoshikawa2023]. This enables end-to-end automated experimental pipelines, closed-loop optimization, and adaptive experimental correction in real time.
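The surrogate-guided loop described above can be illustrated with a minimal sketch. It is not the method of any cited system: `embed` is a hypothetical stand-in for a frozen FM encoder, `run_experiment` is a toy objective replacing a real measurement, and the acquisition rule (kernel-weighted mean plus a novelty bonus) is a simplified substitute for a full Gaussian-process posterior as used in Bayesian optimization.

```python
import math
import random

def embed(x):
    """Hypothetical stand-in for a frozen FM encoder: candidate -> feature vector."""
    return (x, math.sin(3 * x), math.cos(3 * x))

def kernel(a, b, length=0.5):
    """RBF similarity between two feature vectors."""
    d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)

def run_experiment(x):
    """Toy objective standing in for a lab measurement (assumed optimum at x = 0.7)."""
    return -(x - 0.7) ** 2

def acquisition(z, observed, beta=0.3):
    """Kernel-weighted mean of observed outcomes plus an exploration bonus."""
    weights = [kernel(z, z_obs) for z_obs, _ in observed]
    mean = sum(w * y for w, (_, y) in zip(weights, observed)) / sum(weights)
    novelty = 1.0 - max(weights)  # large when z is far from every tried point
    return mean + beta * novelty

def optimize(candidates, n_init=3, n_rounds=10, seed=0):
    """Pick experiments one at a time, re-scoring untried candidates each round."""
    rng = random.Random(seed)
    tried = rng.sample(range(len(candidates)), n_init)
    for _ in range(n_rounds):
        observed = [(embed(candidates[i]), run_experiment(candidates[i])) for i in tried]
        untried = [i for i in range(len(candidates)) if i not in tried]
        tried.append(max(untried, key=lambda i: acquisition(embed(candidates[i]), observed)))
    best = max(tried, key=lambda i: run_experiment(candidates[i]))
    return candidates[best], tried

candidates = [i / 100 for i in range(101)]
best_x, tried = optimize(candidates)
```

The design point is that the search runs in the embedding space rather than over raw inputs, so a richer encoder changes which candidates count as "similar" without changing the loop itself.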
Theoretical Paradigm
FMs augment hypothesis generation and formal reasoning by synthesizing knowledge from large corpora, drawing on graph-based ontologies, and embedding physical constraints. Advances in FM-guided symbolic regression, program synthesis, and theorem proving (e.g., LeanCopilot, DeepSeekProver) allow for generalized, scalable formal model generation and validation. Neuro-symbolic architectures combining LLMs with symbolic logic engines facilitate deductive inference, counterexample generation, and falsifiability analysis, supporting automated theoretical advancement in mathematics and the physical sciences.
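As a rough illustration of FM-guided symbolic regression, the sketch below separates proposal from verification. The candidate expressions in `CANDIDATES` are hard-coded here and only stand in for forms a model might propose; the data are synthetic, generated from an assumed law y = 4.905 x^2. Each candidate is fitted by closed-form least squares and the one with the lowest error is kept.

```python
import math

# Hypothetical candidate laws, standing in for expressions proposed by an FM.
CANDIDATES = {
    "a * x": lambda x, a: a * x,
    "a * x**2": lambda x, a: a * x ** 2,
    "a * sqrt(x)": lambda x, a: a * math.sqrt(x),
}

def fit_scale(f, xs, ys):
    """Closed-form least-squares fit of a in y = a * g(x), with g(x) = f(x, 1)."""
    g = [f(x, 1.0) for x in xs]
    return sum(gi * yi for gi, yi in zip(g, ys)) / sum(gi * gi for gi in g)

def mse(f, a, xs, ys):
    """Mean squared error of the fitted candidate on the data."""
    return sum((f(x, a) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def select_law(xs, ys):
    """Fit every candidate and return (error, expression, coefficient) of the best."""
    scored = []
    for name, f in CANDIDATES.items():
        a = fit_scale(f, xs, ys)
        scored.append((mse(f, a, xs, ys), name, a))
    return min(scored)

# Synthetic data from the assumed law y = 4.905 * x**2.
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [4.905 * x ** 2 for x in xs]
error, law, coefficient = select_law(xs, ys)
```

In a realistic setting the verification step would also score candidates on held-out data, so that a proposed law can be falsified rather than merely fitted.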
Computational Paradigm
FM surrogates increasingly accelerate classical numerical methods in simulation and, in some settings, outperform them. Data-driven operators (e.g., DeepONet, Neural Operators, FactFormer, GraphCast) enable mesh-free, resolution-agnostic solutions to PDEs and other high-dimensional systems, supporting both forward and inverse modeling workflows. Generative models (e.g., DiffusionPDE, Latent Neural Operators) operate as implicit solvers and enable efficient surrogate inversion in ill-posed regimes. These advances facilitate physically grounded, scalable simulation pipelines for climate modeling, fluid dynamics, materials science, and quantum systems.
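One way to see what "resolution-agnostic" means is a spectral layer of the kind used in Fourier-type neural operators: the learned parameters act on a fixed number of Fourier modes rather than on grid points, so the same weights apply to any sampling resolution. The sketch below is an assumption-laden toy, not the architecture of any model named above: it uses a naive DFT, a single channel, and hand-picked `WEIGHTS` where a real model would learn them.

```python
import cmath
import math

def dft_coeffs(u, n_modes):
    """First n_modes Fourier coefficients of samples u, normalized by len(u)."""
    n = len(u)
    return [sum(u[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)) / n
            for k in range(n_modes)]

def spectral_layer(u, weights):
    """Scale the lowest Fourier modes by `weights`, drop the rest, return to grid space."""
    n = len(u)
    coeffs = dft_coeffs(u, len(weights))
    out = []
    for j in range(n):
        x = j / n
        value = (weights[0] * coeffs[0]).real
        for k in range(1, len(weights)):
            # For a real signal, mode k and its conjugate partner sum to 2 * Re(...).
            value += 2 * (weights[k] * coeffs[k] * cmath.exp(2j * math.pi * k * x)).real
        out.append(value)
    return out

def sample(n):
    """Sample u(x) = sin(2*pi*x) + 0.5*cos(4*pi*x) on n equispaced points in [0, 1)."""
    return [math.sin(2 * math.pi * j / n) + 0.5 * math.cos(4 * math.pi * j / n)
            for j in range(n)]

WEIGHTS = [1.0, 0.5 + 0.2j, -0.3j, 0.1]  # hypothetical values; learned in a real model

coarse = spectral_layer(sample(16), WEIGHTS)  # 16-point grid
fine = spectral_layer(sample(32), WEIGHTS)    # 32-point grid, same weights
```

Because the input is band-limited below the retained modes, `coarse` agrees with every second entry of `fine` up to floating-point error, which is the sense in which the layer does not depend on the grid.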
Data-driven Paradigm
FMs unify multimodal, high-dimensional data ingestion (e.g., sequences, images, text, structure) and representation learning, allowing for interpretable feature extraction across domains such as genomics (e.g., DNABERT), proteomics (e.g., AlphaFold, ESMFold), chemistry (MoLFormer, ChemVLM), and climate (ClimaX, GraphCast). Predictive tasks are reframed as conditional generation or retrieval, with FMs achieving or surpassing state-of-the-art in protein structure prediction, molecular property inference, geospatial data interpolation, and event forecasting. Importantly, FMs also enable zero-shot and cross-modal transfer learning, lowering annotation bottlenecks.
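Reframing prediction as retrieval can be sketched as a nearest-neighbor lookup in a shared embedding space. In the toy example below, the three-dimensional vectors and the label names are invented for illustration only; in practice both the query and the label embeddings would come from an FM encoder, which is assumed rather than shown here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)

def zero_shot_label(query_vec, label_vecs):
    """Return the label whose embedding is most similar to the query embedding."""
    return max(label_vecs, key=lambda name: cosine(query_vec, label_vecs[name]))

# Illustrative embeddings (hypothetical values, not produced by any real model).
label_vecs = {
    "kinase": [0.9, 0.1, 0.0],
    "transporter": [0.1, 0.8, 0.3],
    "protease": [0.0, 0.2, 0.9],
}
query = [0.2, 0.7, 0.4]
predicted = zero_shot_label(query, label_vecs)  # "transporter" for these toy vectors
```

No task-specific classifier is trained: adding a new class only requires adding its embedding, which is why this framing reduces the need for annotated examples.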
Cross-Paradigm Integration
Beyond the boundaries of any single paradigm, FMs function as cross-disciplinary orchestrators. They can integrate symbolic theory with empirical data, drive closed-loop experiment-simulation cycles, and serve as research assistants or agents capable of literature synthesis, experimental suggestion, and result analysis. Multimodal and agentic FM deployments (e.g., Coscientist, Agent Laboratory) demonstrate the feasibility of unified, end-to-end scientific workflows that couple theory, simulation, experimentation, and data analytics within a single architecture.
Risks and Challenges of FM-Centric Scientific Paradigms
As FMs gain autonomy and epistemic responsibility, new risks emerge that extend beyond conventional concerns of performance and generalization:
- Bias and Epistemic Fairness: FMs can inherit and amplify existing biases, or introduce new ones, because of non-representative training data and cultural and economic epistemic dominance. As they progress toward agenda-setting and autonomous discovery, such biases risk epistemic homogenization and exclusion.
- Hallucination and Misinformation: FMs can generate plausible but unverified or unfalsifiable outputs, particularly in hypothesis generation and result interpretation, which can propagate errors or mislead scientific discourse.
- Reproducibility and Transparency: Without interpretable reasoning paths, deterministic model checkpoints, and provenance for multi-step FM-driven research, reproducibility and independent validation may be undermined.
- Authorship, Accountability, and Ethics: Machine agency introduces novel questions around credit, liability, and the moral status of non-human discoverers, especially when outputs have genuine novelty or implications for human health/safety.
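The provenance requirement in the Reproducibility and Transparency item above can be made concrete with a small sketch. The record fields, the function names `provenance_record` and `verify_chain`, and the model and checkpoint identifiers are all assumptions chosen for illustration; the idea is only that each FM call in a multi-step pipeline is logged with hashes of its prompt and output, the checkpoint and seed used, and a link to the preceding step.

```python
import hashlib
import json

def _digest(body):
    """SHA-256 over a canonical JSON encoding of the record body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

def provenance_record(step, prompt, output, model_id, checkpoint, seed, parent=None):
    """Build a tamper-evident record for one FM call in a multi-step pipeline."""
    body = {
        "step": step,
        "model_id": model_id,
        "checkpoint": checkpoint,
        "seed": seed,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
        "parent": parent,  # record_id of the previous step, or None for the first
    }
    body["record_id"] = _digest(body)  # computed before the id itself is added
    return body

def verify_chain(records):
    """Check that every record is unmodified and linked to its predecessor."""
    previous_id = None
    for record in records:
        body = {k: v for k, v in record.items() if k != "record_id"}
        if _digest(body) != record["record_id"]:
            return False
        if record["parent"] != previous_id:
            return False
        previous_id = record["record_id"]
    return True

# Hypothetical two-step pipeline; identifiers are placeholders.
first = provenance_record(1, "propose hypothesis", "H1", "toy-model", "ckpt-001", 42)
second = provenance_record(2, "design experiment", "E1", "toy-model", "ckpt-001", 42,
                           parent=first["record_id"])
chain_ok = verify_chain([first, second])
```

Such a log does not make a stochastic model deterministic, but it lets an independent group establish exactly which inputs, checkpoint, and seed produced a reported result.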
Pathways Toward Autonomous Scientific Discovery
To address these challenges and realize the full potential of FMs in science, the following research trajectories are identified:
- Embodied Scientific Agents: FM-empowered robotics, digital twins, and automated labs will integrate high-level planning with low-level control execution, supporting real-time, physically grounded hypothesis testing and iterative refinement.
- Closed-Loop Autonomy: Research agents integrating reinforcement learning, planning-as-inference, and neuro-symbolic architectures will continuously close the loop between hypothesis, action, feedback, and revision, moving beyond open-loop, prompt-driven systems.
- Continual Learning and Generalization: Effective scientific FMs will require memory-augmented architectures, lifelong learning capabilities, and robust transfer mechanisms that allow persistent, adaptive accumulation and updating of cross-domain knowledge over time, while avoiding catastrophic forgetting and domain drift.
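The hypothesis, action, feedback, and revision cycle named under Closed-Loop Autonomy can be sketched in its simplest form. The example below is a deliberately reduced illustration, not a reinforcement-learning or planning-as-inference agent: the "hypothesis" is an interval believed to contain an unknown response threshold, the "action" is the experiment that splits that interval in half, and `run_experiment` is a toy environment with an assumed threshold of 0.37.

```python
def run_experiment(dose, true_threshold=0.37):
    """Toy environment (assumed): reports whether a response occurs at this dose."""
    return dose >= true_threshold

def closed_loop(low=0.0, high=1.0, tolerance=1e-3, max_steps=50):
    """Iteratively narrow the hypothesis 'threshold lies in [low, high]'."""
    history = []
    for _ in range(max_steps):
        if high - low <= tolerance:
            break                          # hypothesis is precise enough; stop
        probe = (low + high) / 2           # action: pick the experiment that halves the interval
        responded = run_experiment(probe)  # feedback from the environment
        if responded:
            high = probe                   # revision: threshold is at or below the probe
        else:
            low = probe                    # revision: threshold is above the probe
        history.append((probe, responded))
    return (low + high) / 2, history

estimate, history = closed_loop()
```

An open-loop system would fix all probes in advance; here each experiment depends on the outcome of the previous one, which is why ten experiments suffice to locate the threshold to within the stated tolerance.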
Conclusion
Foundation models are moving science from workflow enhancement toward a paradigm-level transition. By providing unified, multimodal, and increasingly autonomous reasoning, they are transforming both the logistics and the epistemology of scientific discovery across three stages: meta-scientific integration, hybrid co-creation, and autonomous agentic discovery. This shift necessitates new governance, validation, and ethical frameworks, as well as a theoretical re-examination of scientific agency itself. Future research should focus on advancing FM capabilities, mitigating epistemic risks, and rigorously implementing mechanisms for accountability and interpretability as FMs take on larger roles in the scientific process.