Codes for Limited-Magnitude Probability Error in DNA Storage (2405.10447v1)
Abstract: DNA is one of the most appealing storage media thanks to its high density, durability, and replicability. Emerging DNA storage technologies use composite DNA letters, in which information is represented by probability vectors, achieving higher information density and lower synthesis cost than regular DNA letters. However, composite-letter storage still suffers from unavoidable noise and information corruption. This paper studies the channel of composite DNA letters in DNA-based storage systems and introduces block codes for limited-magnitude probability errors on probability vectors. First, outer and inner bounds for limited-magnitude probability error-correcting codes are provided. Next, code constructions are proposed for the setting where the number of errors is bounded by t, the error magnitudes are bounded by l, and the probability resolution is fixed at k. These constructions exploit the properties of limited-magnitude probability errors in DNA-based storage systems, leading to improved performance in terms of complexity and redundancy. In addition, the asymptotic optimality of one of the proposed constructions is established. Finally, systematic codes based on one of the proposed constructions are presented, enabling efficient information extraction in practical implementations.
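The error model named in the abstract (at most t erroneous symbols, each with magnitude at most l, at resolution k) can be illustrated with a minimal Python sketch. The abstract gives no formal definitions, so everything below is an assumption made for illustration: a composite letter is modeled as four non-negative integer counts over (A, C, G, T) summing to k, and the magnitude of an error on one symbol is taken to be the total count that moved upward between the stored and the read vector. The function names are hypothetical and this is not the paper's construction.

```python
from typing import Sequence

# Assumption: a composite letter is a tuple of 4 non-negative integer
# counts over (A, C, G, T) that sum to the resolution k, so the
# probability of each base is count / k.
Symbol = Sequence[int]


def is_valid_symbol(x: Symbol, k: int) -> bool:
    """Check that x is a probability vector with resolution k."""
    return len(x) == 4 and all(v >= 0 for v in x) and sum(x) == k


def error_magnitude(sent: Symbol, received: Symbol) -> int:
    """Assumed magnitude: total count moved upward (in units of 1/k).

    Both vectors sum to k, so the upward and downward changes are
    equal and the magnitude is zero only when the symbols are equal.
    """
    return sum(max(r - s, 0) for s, r in zip(sent, received))


def within_error_model(sent_word: Sequence[Symbol],
                       received_word: Sequence[Symbol],
                       k: int, l: int, t: int) -> bool:
    """True if received_word differs from sent_word in at most t
    symbols, each with assumed magnitude at most l."""
    if len(sent_word) != len(received_word):
        return False
    magnitudes = []
    for s, r in zip(sent_word, received_word):
        if not (is_valid_symbol(s, k) and is_valid_symbol(r, k)):
            return False
        magnitudes.append(error_magnitude(s, r))
    num_errors = sum(1 for m in magnitudes if m > 0)
    return num_errors <= t and max(magnitudes, default=0) <= l


# Hypothetical example with resolution k = 6 and block length 3.
sent = [(3, 3, 0, 0), (6, 0, 0, 0), (2, 2, 1, 1)]
small_shift = [(4, 2, 0, 0), (6, 0, 0, 0), (2, 2, 1, 1)]  # one symbol, magnitude 1
large_shift = [(5, 1, 0, 0), (6, 0, 0, 0), (2, 2, 1, 1)]  # one symbol, magnitude 2
two_errors = [(4, 2, 0, 0), (5, 1, 0, 0), (2, 2, 1, 1)]   # two symbols, magnitude 1 each
```

Under these assumptions, `within_error_model(sent, small_shift, k=6, l=1, t=1)` holds, while `large_shift` exceeds the magnitude bound and `two_errors` exceeds the error-count bound for the same parameters.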