RiboDiffusion: Tertiary Structure-based RNA Inverse Folding with Generative Diffusion Models (2404.11199v1)

Published 17 Apr 2024 in q-bio.BM

Abstract: RNA design shows growing applications in synthetic biology and therapeutics, driven by the crucial role of RNA in various biological processes. A fundamental challenge is to find functional RNA sequences that satisfy given structural constraints, known as the inverse folding problem. Computational approaches have emerged to address this problem based on secondary structures. However, designing RNA sequences directly from 3D structures is still challenging, due to the scarcity of data, the non-unique structure-sequence mapping, and the flexibility of RNA conformation. In this study, we propose RiboDiffusion, a generative diffusion model for RNA inverse folding that can learn the conditional distribution of RNA sequences given 3D backbone structures. Our model consists of a graph neural network-based structure module and a Transformer-based sequence module, which iteratively transforms random sequences into desired sequences. By tuning the sampling weight, our model allows for a trade-off between sequence recovery and diversity to explore more candidates. We split test sets based on RNA clustering with different cut-offs for sequence or structure similarity. Our model outperforms baselines in sequence recovery, with an average relative improvement of $11\%$ for sequence similarity splits and $16\%$ for structure similarity splits. Moreover, RiboDiffusion performs consistently well across various RNA length categories and RNA types. We also apply in-silico folding to validate whether the generated sequences can fold into the given 3D RNA backbones. Our method could be a powerful tool for RNA design that explores the vast sequence space and finds novel solutions to 3D structural constraints.
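The abstract reports results as sequence recovery and as a relative improvement over baselines. A minimal sketch of how these two quantities are commonly computed follows. It assumes that recovery is the fraction of positions at which a designed sequence matches the native one, and that relative improvement is the score difference divided by the baseline score. The abstract does not spell out the exact evaluation protocol, so the function names `sequence_recovery` and `relative_improvement` and the toy inputs are illustrative and not taken from the paper.

```python
def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed nucleotide equals the native one.

    Assumed definition; both sequences must share the same, non-zero length
    because inverse folding designs one nucleotide per backbone position.
    """
    if len(designed) != len(native):
        raise ValueError("designed and native sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


def relative_improvement(model_score: float, baseline_score: float) -> float:
    """Relative gain of a model score over a baseline score (assumed definition)."""
    if baseline_score == 0:
        raise ValueError("baseline score must be non-zero")
    return (model_score - baseline_score) / baseline_score


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not data from the paper.
    native = "GGAACC"
    designed = "GGAUCC"
    print(f"recovery = {sequence_recovery(designed, native):.3f}")
    # Hypothetical scores, chosen only to show the arithmetic.
    print(f"relative improvement = {relative_improvement(0.5, 0.4):.2%}")
```

Under these definitions, a relative improvement of $11\%$ means the model's average recovery is 1.11 times the baseline's, not that recovery rose by 11 percentage points.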
