
Reconstruction of Sets of Strings from Prefix/Suffix Compositions (2110.02352v1)

Published 5 Oct 2021 in cs.IT and math.IT

Abstract: The problem of reconstructing strings from substring information has found many applications due to its importance in genomic data sequencing and DNA- and polymer-based data storage. One practically important and challenging paradigm requires reconstructing mixtures of strings based on the union of compositions of their prefixes and suffixes, generated by mass spectrometry devices. We describe new coding methods that allow for unique joint reconstruction of subsets of strings selected from a code and provide upper and lower bounds on the asymptotic rate of the underlying codebooks. Our code constructions combine properties of binary B_h and Dyck strings and can be extended to accommodate missing substrings in the pool. As auxiliary results, we obtain the first known bounds on binary B_h sequences for arbitrary even parameters h, and also describe various error models inherent to mass spectrometry analysis. This paper contains a correction of the prior work by the authors, published in [24]. In particular, the bounds on the prefix codes are now corrected.
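
To make the observation model in the abstract concrete, the following is a minimal sketch of how the pooled prefix/suffix compositions of a set of binary strings might be computed. It is an illustration only, not the authors' construction. It assumes that the composition of a string is the pair (number of 0s, number of 1s), and that the pooled output is the multiset union over all nonempty prefixes and suffixes of every string in the set. The function names are hypothetical.

```python
from collections import Counter

def composition(s):
    """Composition of a binary string: (number of 0s, number of 1s).
    Assumption: symbol order is discarded, only the counts are kept."""
    return (s.count("0"), s.count("1"))

def prefix_suffix_compositions(s):
    """Multiset of compositions of all nonempty prefixes and suffixes of s.
    Assumption: the full string counts once as a prefix and once as a suffix."""
    out = Counter()
    for i in range(1, len(s) + 1):
        out[composition(s[:i])] += 1   # prefix of length i
        out[composition(s[-i:])] += 1  # suffix of length i
    return out

def pooled_compositions(strings):
    """Multiset union of the prefix/suffix compositions of several strings."""
    total = Counter()
    for s in strings:
        total += prefix_suffix_compositions(s)
    return total

if __name__ == "__main__":
    # A string and its reversal yield the same multiset, so they cannot be
    # told apart from this information alone.
    print(prefix_suffix_compositions("01") == prefix_suffix_compositions("10"))
    print(pooled_compositions(["0011", "0101"]))
```

Under these assumptions a string and its reversal always produce identical outputs, because the prefixes of one are the reversed suffixes of the other. This is one simple reason why arbitrary sets of strings are not uniquely reconstructible and a restricted codebook is needed.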
