Zero-Error Capacity of Duplication Channels (1902.06275v4)
Abstract: This paper is concerned with the problem of error-free communication over the i.i.d. duplication channel which acts on a transmitted sequence $ x_1 \cdots x_n $ by inserting a random number of copies of each symbol $ x_i $ next to the original symbol. The random variables representing the numbers of inserted copies at each position $ i $ are independent and take values in $ {0, 1, \ldots, r} $, where $ r $ is a fixed parameter. A more general model in which blocks of $ \ell $ consecutive symbols are being duplicated, and which is inspired by DNA-based data storage systems wherein the stored molecules are subject to tandem-duplication mutations, is also analyzed. A construction of optimal codes correcting all patterns of errors of this type is described, and the zero-error capacity of the duplication channel---the largest rate at which information can be transmitted through it in an error-free manner---is determined for each $ \ell $ and $ r $.