A Bird-Eye view on DNA Storage Simulators (2404.04877v1)

Published 7 Apr 2024 in cs.IT, cs.CY, cs.ET, and math.IT

Abstract: Given the huge demand for data storage, DNA-based storage solutions are promising because of their longevity, low power consumption, and high capacity. In practice, however, storing data in DNA is expensive and challenging. Researchers and developers have therefore built software that simulates real-life DNA storage without incurring that cost. This paper reviews some of the software that performs DNA storage simulations in different domains. It also explains core concepts such as synthesis, sequencing, clustering, reconstruction, GC window, and K-mer window, and gives an overview of existing algorithms. Finally, we present three software tools on the basis of domain, implementation techniques, and customer/commercial usability.
