
Rethinking Performance Measures of RNA Secondary Structure Problems (2401.05351v1)

Published 4 Dec 2023 in q-bio.BM and cs.LG

Abstract: Accurate RNA secondary structure prediction is vital for understanding cellular regulation and disease mechanisms. Deep learning (DL) methods have surpassed traditional algorithms by predicting complex features like pseudoknots and multi-interacting base pairs. However, traditional distance measures cannot readily handle such tertiary interactions, and the currently used evaluation measures (F1 score, MCC) have limitations. We propose the Weisfeiler-Lehman graph kernel (WL) as an alternative metric. Embracing graph-based metrics like WL enables fair and accurate evaluation of RNA structure prediction algorithms. Further, WL provides informative guidance, as demonstrated in an RNA design experiment.
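To make the idea of a graph-based metric concrete, the following is a minimal sketch of a normalized Weisfeiler-Lehman subtree kernel applied to two RNA structures over the same sequence. It is an illustration under stated assumptions, not the paper's implementation: the graph construction (nucleotides as nodes, backbone and base-pair edges), the use of nucleotide letters as initial labels, the number of refinement rounds, and the names `rna_graph`, `wl_features`, and `wl_similarity` are all hypothetical choices that the abstract does not specify.

```python
import math
from collections import Counter


def rna_graph(length, pairs):
    """Build an adjacency map with backbone edges (i, i+1) and base-pair edges.

    A nucleotide may appear in several pairs, so multi-interacting
    base pairs and pseudoknots need no special handling.
    """
    adj = {i: set() for i in range(length)}
    for i in range(length - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    for i, j in pairs:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def wl_features(adj, labels, iterations=3):
    """Count node labels over successive WL refinement rounds.

    Refined labels are kept as nested tuples (own label, sorted neighbor
    labels) rather than hashed, so they are comparable across graphs.
    """
    feats = Counter(labels.values())
    current = dict(labels)
    for _ in range(iterations):
        refined = {}
        for v in adj:
            neighbors = tuple(sorted(current[u] for u in adj[v]))
            refined[v] = (current[v], neighbors)
        current = refined
        feats.update(current.values())
    return feats


def wl_similarity(seq, pairs_a, pairs_b, iterations=3):
    """Normalized WL kernel in [0, 1] between two structures of one sequence."""
    labels = {i: nt for i, nt in enumerate(seq)}  # assumed initial labeling
    fa = wl_features(rna_graph(len(seq), pairs_a), labels, iterations)
    fb = wl_features(rna_graph(len(seq), pairs_b), labels, iterations)

    def dot(x, y):
        return sum(count * y[key] for key, count in x.items())

    return dot(fa, fb) / math.sqrt(dot(fa, fa) * dot(fb, fb))


# Usage: a three-pair hairpin versus a prediction that misses the inner pair.
seq = "GGGAAACCC"
true_pairs = [(0, 8), (1, 7), (2, 6)]
pred_pairs = [(0, 8), (1, 7)]
same = wl_similarity(seq, true_pairs, true_pairs)
partial = wl_similarity(seq, true_pairs, pred_pairs)
```

Identical structures score 1, and the score drops as the neighborhoods around mispredicted positions diverge. Because the comparison operates on the graph rather than on a list of pair positions, it does not depend on the structure being expressible in nested dot-bracket notation.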
