
Permutation Recovery Problem against Deletion Errors for DNA Data Storage (2403.15827v1)

Published 23 Mar 2024 in cs.IT and math.IT

Abstract: Owing to its immense storage density and durability, DNA has emerged as a promising storage medium. However, due to technological constraints, data can only be written onto many short DNA molecules, called data blocks, that are stored in an unordered way. To handle the unordered nature of DNA data storage systems, a unique address is typically prepended to each data block to form a DNA strand. However, DNA storage systems are prone to errors and generate multiple noisy copies of each strand, called DNA reads. Thus, we study the permutation recovery problem against deletion errors for DNA data storage. The permutation recovery problem for DNA data storage requires one to reconstruct the addresses or, in other words, to uniquely identify the noisy reads. By successfully reconstructing the addresses, one can essentially determine the correct order of the data blocks, effectively solving the clustering problem. We first show that we can almost surely identify all the noisy reads under certain mild assumptions. We then propose a permutation recovery procedure and analyze its complexity.
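To make the identification task concrete, here is a minimal toy sketch in Python. It is not the procedure proposed in the paper. It assumes the noisy address portion of each read has already been isolated and that the only errors are deletions, so a noisy address must be a subsequence of its original address. The function names `is_subsequence`, `candidate_addresses`, and `identify` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Optional


def is_subsequence(short: str, long: str) -> bool:
    """Return True if `short` can be obtained from `long` by deletions only."""
    it = iter(long)
    # `ch in it` advances the iterator, so characters must appear in order.
    return all(ch in it for ch in short)


def candidate_addresses(noisy: str, addresses: List[str]) -> List[str]:
    """List every address that could have produced `noisy` via deletions."""
    return [a for a in addresses if is_subsequence(noisy, a)]


def identify(noisy_reads: List[str], addresses: List[str]) -> Dict[str, Optional[str]]:
    """Map each noisy address to its original address when the match is unique.

    A read with zero or several candidate addresses is mapped to None,
    meaning it cannot be uniquely identified by this simple test.
    """
    result: Dict[str, Optional[str]] = {}
    for noisy in noisy_reads:
        candidates = candidate_addresses(noisy, addresses)
        result[noisy] = candidates[0] if len(candidates) == 1 else None
    return result


if __name__ == "__main__":
    # Hypothetical addresses and noisy reads, made up for this example.
    addresses = ["ACGT", "TTGA", "CCAG"]
    noisy_reads = ["AGT", "TG", "A"]
    for read, address in identify(noisy_reads, addresses).items():
        print(read, "->", address)
```

In this toy input, `AGT` and `TG` each fit exactly one address, while the single symbol `A` fits all three and stays unidentified. This shows why unique identification depends on how many deletions occur and on how the addresses are chosen.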

