The Chvátal-Sankoff problem: Understanding random string comparison through stochastic processes (2212.01582v2)
Abstract: Given two equally long, uniformly random binary strings, the expected length of their longest common subsequence (LCS) is asymptotically proportional to the strings' length. Finding the proportionality coefficient $\gamma$, i.e. the limit of the normalised LCS length for two random binary strings of length $n \to \infty$, is a very natural problem, first posed by Chvátal and Sankoff in 1975, and as yet unresolved. This problem has relevance to diverse fields ranging from combinatorics and algorithm analysis to coding theory and computational biology. Using methods of statistical mechanics, as well as some existing results on the combinatorial structure of LCS, we link the constant $\gamma$ to the parameters of a certain stochastic particle process, which we use to obtain a new estimate for $\gamma$.
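To make the definition of $\gamma$ concrete, the normalised LCS length can be estimated empirically for a finite $n$ by plain Monte Carlo sampling. The sketch below is only an illustration of the quantity being studied, not the stochastic particle process method of the paper; the function names `lcs_length` and `estimate_normalised_lcs`, the sample sizes, and the seed are assumptions chosen for this example. Because the expected LCS length is superadditive in $n$, the finite-$n$ average approaches $\gamma$ from below, so such a simulation tends to underestimate the constant.

```python
import random


def lcs_length(a, b):
    """Length of the longest common subsequence, via the standard
    quadratic-time dynamic programme kept one row at a time."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching symbols extend the LCS of the two shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last symbol of one prefix or the other.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def estimate_normalised_lcs(n, trials, seed=0):
    """Monte Carlo average of LCS(a, b) / n over independent pairs of
    uniformly random binary strings of length n (illustrative helper)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = [rng.getrandbits(1) for _ in range(n)]
        b = [rng.getrandbits(1) for _ in range(n)]
        total += lcs_length(a, b)
    return total / (trials * n)


if __name__ == "__main__":
    # Larger n brings the average closer to gamma, at quadratic cost per pair.
    for n in (50, 200, 800):
        print(n, estimate_normalised_lcs(n, trials=20))
```

The quadratic cost per pair limits this direct approach to moderate $n$, which is one reason analytic links such as the one described in the abstract are of interest.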