
Virtual Cells: Predict, Explain, Discover (2505.14613v3)

Published 20 May 2025 in cs.LG and q-bio.QM

Abstract: Drug discovery is fundamentally a process of inferring the effects of treatments on patients, and would therefore benefit immensely from computational models that can reliably simulate patient responses, enabling researchers to generate and test large numbers of therapeutic hypotheses safely and economically before initiating costly clinical trials. Even a more specific model that predicts the functional response of cells to a wide range of perturbations would be tremendously valuable for discovering safe and effective treatments that successfully translate to the clinic. Creating such virtual cells has long been a goal of the computational research community that unfortunately remains unachieved given the daunting complexity and scale of cellular biology. Nevertheless, recent advances in AI, computing power, lab automation, and high-throughput cellular profiling provide new opportunities for reaching this goal. In this perspective, we present a vision for developing and evaluating virtual cells that builds on our experience at Recursion. We argue that in order to be a useful tool to discover novel biology, virtual cells must accurately predict the functional response of a cell to perturbations and explain how the predicted response is a consequence of modifications to key biomolecular interactions. We then introduce key principles for designing therapeutically-relevant virtual cells, describe a lab-in-the-loop approach for generating novel insights with them, and advocate for biologically-grounded benchmarks to guide virtual cell development. Finally, we make the case that our approach to virtual cells provides a useful framework for building other models at higher levels of organization, including virtual patients. We hope that these directions prove useful to the research community in developing virtual models optimized for positive impact on drug discovery outcomes.

Summary

Virtual Cells: Predict, Explain, Discover

The paper "Virtual Cells: Predict, Explain, Discover" addresses computational models for drug discovery, with the aim of developing systems that reliably simulate responses to therapies at the cellular level. The authors argue that accurate virtual cells are a critical step toward improving drug discovery. Although the complexity and scale of cellular biology remain a challenge, advances in AI, computing power, lab automation, and high-throughput cellular profiling offer new opportunities to reach this goal.

Proposed Framework for Virtual Cells

The authors propose a framework wherein virtual cells are equipped with three key capabilities: Predict, Explain, and Discover (P-E-D). These models should not only predict the functional cellular response to perturbations but also explain these changes via underlying biomolecular mechanisms, ultimately enabling the discovery of novel biological insights.

Predict Functional Responses

Virtual cells should first excel at predicting cellular responses to various perturbations. This entails modeling the holistic functional response, rather than isolated biomolecular interactions, to capture the cumulative behavior of cells accurately. The implementation of AI and ML models trained on expansive datasets can serve as a surrogate for expensive, real-world assays, thereby facilitating hypothesis testing in silico. The principle of predicting relative changes is emphasized, where models account for a cell's state before a perturbation to isolate targeted effects.
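The idea of predicting relative changes can be illustrated with a small sketch. It is not the paper's method: the function name `relative_response` and the toy feature vectors are assumptions made for illustration, and the "response" here is simply the per-feature difference between the mean perturbed profile and the mean profile of matched, unperturbed control cells.

```python
from statistics import mean


def relative_response(control, perturbed):
    """Per-feature change of perturbed cells relative to matched controls.

    Both arguments are lists of equal-length feature vectors (hypothetical
    readouts such as expression or morphology features).
    """
    n_features = len(control[0])
    # Baseline: the average state of the cells before/without perturbation.
    baseline = [mean(row[j] for row in control) for j in range(n_features)]
    # Average state of the cells after the perturbation.
    treated = [mean(row[j] for row in perturbed) for j in range(n_features)]
    # Subtracting the baseline isolates the effect attributed to the perturbation.
    return [t - b for t, b in zip(treated, baseline)]


# Toy data: two control cells and two perturbed cells, three features each.
control = [[1.0, 2.0, 3.0], [1.0, 4.0, 3.0]]
perturbed = [[2.0, 3.0, 3.0], [4.0, 3.0, 3.0]]

delta = relative_response(control, perturbed)
print(delta)  # [2.0, 0.0, 0.0]
```

In this toy case only the first feature shifts, so a model trained on such deltas would be asked to predict the change caused by the perturbation rather than the absolute cell state.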

Explain Responses Mechanistically

Beyond prediction, virtual cells should provide structured, testable explanations that detail how perturbations result in observed cellular outcomes. This requires framing cellular behavior as modifications to biomolecular interactions. The paper suggests leveraging both AI tools and atomistic simulations to hypothesize dynamic changes to key interactions, supporting mechanistic insights crucial for therapeutic applications.

Discover Novel Biology

Equipped with predictive and explanatory abilities, virtual cells can drive discovery. A lab-in-the-loop system allows for iterative learning, where models propose and test hypotheses to refine their understanding continuously. This parallels scientific theory refinement, where models evolve through falsifications leading to new discoveries. The framework envisions autonomous agents orchestrating these processes, opening a path to developing scientist AIs capable of transformative advancements in drug discovery workflows.
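A lab-in-the-loop cycle of this kind can be sketched as a simple propose, test, update loop. Everything below is a hypothetical illustration rather than the authors' system: the function `lab_in_the_loop`, the perturbation names, the selection rule (test the largest predicted effect first), and the `tolerance` threshold are all assumptions, and the "lab" is simulated by a lookup table.

```python
def lab_in_the_loop(predictions, run_assay, rounds, tolerance=0.5):
    """Toy propose -> test -> update loop (illustrative only).

    predictions: dict mapping a perturbation name to its predicted effect.
    run_assay:   callable returning the measured effect for a perturbation
                 (stands in for a real experiment).
    """
    model = dict(predictions)  # current beliefs; updated as data arrive
    tested = set()
    falsified = []             # predictions contradicted by the experiment

    for _ in range(rounds):
        untested = [p for p in model if p not in tested]
        if not untested:
            break
        # Propose: pick the untested perturbation with the largest predicted effect.
        proposal = max(untested, key=lambda p: abs(model[p]))
        # Test: run the (simulated) experiment.
        measured = run_assay(proposal)
        # Compare: record the prediction as falsified if it missed by more than tolerance.
        if abs(measured - model[proposal]) > tolerance:
            falsified.append(proposal)
        # Update: replace the prediction with the measurement.
        model[proposal] = measured
        tested.add(proposal)

    return model, falsified


# Hypothetical ground truth that the simulated lab reveals one experiment at a time.
TRUE_EFFECT = {"geneA_KO": 0.1, "geneB_KO": 1.9, "compound_X": -1.0}
initial = {"geneA_KO": 2.0, "geneB_KO": 1.8, "compound_X": 0.0}

model, falsified = lab_in_the_loop(initial, TRUE_EFFECT.__getitem__, rounds=2)
print(model)      # {'geneA_KO': 0.1, 'geneB_KO': 1.9, 'compound_X': 0.0}
print(falsified)  # ['geneA_KO']
```

A real system would retrain the underlying model on each new measurement instead of overwriting a single entry, but the loop structure (hypothesis, experiment, falsification, revision) is the part this sketch is meant to show.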

Implications and the Path Forward

Notably, the paper steers away from classic mechanistic simulation, suggesting instead that AI models trained on large interventional datasets can deliver both predictive and explanatory capabilities efficiently. While the primary goal is to transform drug discovery, the same approach could extend to virtual models at higher levels of biological organization (tissues, organs, and patients) by building on the capabilities described for virtual cells.

Benchmarking Framework

The paper underscores the importance of rigorous benchmarks for assessing virtual cell capabilities systematically. Benchmarks should cover functional responses, cell contexts, and perturbations, and should evaluate each of the P-E-D capabilities in line with real-world therapeutic goals. The paper provides a detailed framework that outlines performance levels and calls for modular, standardized benchmarking software to encourage widespread, consistent evaluation.

Conclusion

"Virtual Cells: Predict, Explain, Discover" outlines a forward-looking approach to computational models at the intersection of biology and AI. The proposed framework aims to improve drug discovery by developing virtual cells that accurately predict cellular responses, explain them mechanistically, and support the discovery of novel therapeutic insights. The discussion extends to models at higher levels of organization, such as virtual patients, which could in turn advance personalized medicine. The perspective is a call for continued collaboration and refinement, with biologically grounded benchmarks as the priority for realizing the potential of these virtual models.
