Optimal Almost-Balanced Sequences (2405.08625v1)
Abstract: This paper presents a novel approach to the constrained coding problem of generating almost-balanced sequences. While strictly balanced sequences have been well studied, the problem of designing efficient algorithms with small redundancy, preferably constant or even a single bit, for almost-balanced sequences has remained unsolved. A sequence of length $n$ is $\varepsilon(n)$-almost balanced if its Hamming weight lies in the interval $[0.5n - \varepsilon(n),\, 0.5n + \varepsilon(n)]$. It is known that any algorithm using a constant number of redundancy bits must have $\varepsilon(n)$ of order $\Theta(\sqrt{n})$, with $O(n)$ average time complexity. However, prior solutions with a single redundancy bit required $\varepsilon(n)$ to grow linearly in $n$. Using an iterative method and arithmetic coding, we construct almost-balanced codes with a single redundancy bit. Notably, our method surpasses previous approaches by achieving the optimal order $\varepsilon(n) = \Theta(\sqrt{n})$. Additionally, we extend our method to the non-binary case, considering $q$-ary almost polarity-balanced sequences for even $q$ and almost symbol-balanced sequences for $q=4$. Our work gives the first asymptotically optimal solutions for almost-balanced sequences over both binary and non-binary alphabets.
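To make the definition and the $\Theta(\sqrt{n})$ order concrete, the following is a minimal counting sketch, not the paper's encoder. It assumes a single redundancy bit means that $n-1$ information bits are mapped to $n$-bit codewords, so at least $2^{n-1}$ words of length $n$ must be $\varepsilon$-almost balanced. The function names (`is_almost_balanced`, `count_almost_balanced`, `min_eps_single_bit`) are hypothetical and chosen for illustration only.

```python
from math import comb, sqrt


def is_almost_balanced(bits, eps):
    """True if the Hamming weight of `bits` is within eps of n/2."""
    n = len(bits)
    return abs(sum(bits) - n / 2) <= eps


def count_almost_balanced(n, eps):
    """Number of length-n binary words whose weight is within eps of n/2."""
    return sum(comb(n, w) for w in range(n + 1) if abs(w - n / 2) <= eps)


def min_eps_single_bit(n):
    """Smallest integer eps for which at least 2**(n-1) words of length n
    are eps-almost balanced (assumption: one redundancy bit, so n-1
    information bits must fit into the almost-balanced words)."""
    target = 2 ** (n - 1)
    eps = 0
    while count_almost_balanced(n, eps) < target:
        eps += 1
    return eps


if __name__ == "__main__":
    for n in (4, 16, 100, 400):
        eps = min_eps_single_bit(n)
        # The ratio eps / sqrt(n) stays bounded, in line with the sqrt(n) order.
        print(n, eps, eps / sqrt(n))
```

This only checks that enough almost-balanced words exist for a given $\varepsilon$; it says nothing about how to encode or decode efficiently, which is the problem the paper addresses.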