Maximum number of DNA address sequences satisfying constraints C1–C4
Determine, as a function of the address length n, the largest possible cardinality u(n) of a set of DNA address sequences of length n that simultaneously satisfy four constraints: (C1) each sequence and all of its sufficiently long prefixes have GC content of approximately 50%; (C2) the Hamming distance between any two distinct sequences is large (e.g., at least half the address length); (C3) the sequences are mutually uncorrelated, meaning no proper prefix of any sequence is a suffix of that sequence or of any other sequence in the set; and (C4) the sequences exhibit no secondary (folding) structures predicted by thermodynamic models.
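To make the combinatorial constraints concrete, the following is a minimal Python sketch that checks C1–C3 for a candidate set of addresses. It is only an illustration under stated assumptions: the statement leaves "approximately 50%" and "sufficiently long" unquantified, so the parameters `min_prefix` and `tol` are hypothetical choices, as are all function names. C4 is not checked, since it depends on a thermodynamic folding model that the statement does not specify.

```python
from itertools import combinations

def gc_balanced(seq, min_prefix=4, tol=1):
    """C1 (assumed form): every prefix of length >= min_prefix has a
    G/C count within tol of half the prefix length."""
    gc = 0
    for i, base in enumerate(seq, start=1):
        gc += base in "GC"
        # |gc - i/2| > tol, written with integers only
        if i >= min_prefix and abs(2 * gc - i) > 2 * tol:
            return False
    return True

def hamming(a, b):
    """Number of positions where equal-length strings a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def overlaps(a, b):
    """True if some non-empty proper prefix of a equals a suffix of b."""
    return any(a[:k] == b[-k:] for k in range(1, len(a)))

def satisfies_c1_c3(addresses, min_dist=None):
    """Check C1-C3 for a list of equal-length strings over {A, C, G, T}."""
    n = len(addresses[0])
    if any(len(s) != n for s in addresses):
        return False
    if min_dist is None:
        min_dist = (n + 1) // 2  # "at least half the address length"
    if not all(gc_balanced(s) for s in addresses):
        return False
    if any(hamming(a, b) < min_dist for a, b in combinations(addresses, 2)):
        return False
    # C3: test every ordered pair, including each sequence against itself
    return not any(overlaps(a, b) for a in addresses for b in addresses)
```

For example, `satisfies_c1_c3(["AACG", "TTGC"])` holds under these assumed parameters, while `["ACGT", "GTAC"]` fails C3 because the prefix `AC` of the first sequence is a suffix of the second. A checker of this kind only verifies a given set; it says nothing about the maximum cardinality u(n), which is the open question.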
References
It remains an open problem to determine the largest number of address sequences that jointly satisfy the constraints C1–C4.