
Near-Optimal Average-Case Approximate Trace Reconstruction from Few Traces (2107.11530v2)

Published 24 Jul 2021 in cs.DS and cs.DM

Abstract: In the standard trace reconstruction problem, the goal is to \emph{exactly} reconstruct an unknown source string $\mathsf{x} \in \{0,1\}^n$ from independent "traces", which are copies of $\mathsf{x}$ that have been corrupted by a $\delta$-deletion channel which independently deletes each bit of $\mathsf{x}$ with probability $\delta$ and concatenates the surviving bits. We study the \emph{approximate} trace reconstruction problem, in which the goal is only to obtain a high-accuracy approximation of $\mathsf{x}$ rather than an exact reconstruction. We give an efficient algorithm, and a near-matching lower bound, for approximate reconstruction of a random source string $\mathsf{x} \in \{0,1\}^n$ from few traces. Our main algorithmic result is a polynomial-time algorithm with the following property: for any deletion rate $0 < \delta < 1$ (which may depend on $n$), for almost every source string $\mathsf{x} \in \{0,1\}^n$, given any number $M \leq \Theta(1/\delta)$ of traces from $\mathrm{Del}_\delta(\mathsf{x})$, the algorithm constructs a hypothesis string $\widehat{\mathsf{x}}$ that has edit distance at most $n \cdot (\delta M)^{\Omega(M)}$ from $\mathsf{x}$. We also prove a near-matching information-theoretic lower bound showing that given $M \leq \Theta(1/\delta)$ traces from $\mathrm{Del}_\delta(\mathsf{x})$ for a random $n$-bit string $\mathsf{x}$, the smallest possible expected edit distance that any algorithm can achieve, regardless of its running time, is $n \cdot (\delta M)^{O(M)}$.
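To make the setup in the abstract concrete, the following is a minimal Python sketch of the $\delta$-deletion channel and of the edit-distance measure used to score a hypothesis string. It is not the paper's reconstruction algorithm. The function names `deletion_channel` and `edit_distance`, the parameter values, and the use of Levenshtein distance (insertions, deletions, substitutions) as the edit distance are assumptions made for illustration.

```python
import random


def deletion_channel(x: str, delta: float, rng: random.Random) -> str:
    """Delete each symbol of x independently with probability delta,
    then concatenate the surviving symbols (one "trace" of x)."""
    return "".join(bit for bit in x if rng.random() >= delta)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row.
    (Assumed here as the edit-distance measure; hypothetical helper.)"""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute or match
            ))
        prev = curr
    return prev[-1]


# Illustrative parameters (assumptions, not taken from the paper).
rng = random.Random(0)
n, delta, M = 200, 0.05, 4  # here M <= 1/delta, the "few traces" regime

# A uniformly random n-bit source string and M independent traces of it.
x = "".join(rng.choice("01") for _ in range(n))
traces = [deletion_channel(x, delta, rng) for _ in range(M)]

for t in traces:
    # A trace is a subsequence of x, so its edit distance to x is exactly
    # the number of deleted bits.
    print(len(t), edit_distance(x, t))
```

Using a single trace as the hypothesis gives edit distance equal to the number of deleted bits, which is about $\delta n$ in expectation. The abstract's upper bound $n \cdot (\delta M)^{\Omega(M)}$ describes how much better a hypothesis built from $M$ traces can be.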

Citations (11)
