**Published in:** LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)

In the trace reconstruction problem, an unknown source string x ∈ {0,1}ⁿ is transmitted through a probabilistic deletion channel which independently deletes each bit with some fixed probability δ and concatenates the surviving bits, resulting in a trace of x. The problem is to reconstruct x given access to independent traces. Trace reconstruction of arbitrary (worst-case) strings is a challenging problem, with the current state of the art for poly(n)-time algorithms being the 2004 algorithm of Batu et al. [T. Batu et al., 2004]. This algorithm can reconstruct an arbitrary source string x ∈ {0,1}ⁿ in poly(n) time provided that the deletion rate δ satisfies δ ≤ n^{-(1/2 + ε)} for some ε > 0.
In this work we improve on the result of [T. Batu et al., 2004] by giving a poly(n)-time algorithm for trace reconstruction for any deletion rate δ ≤ n^{-(1/3 + ε)}. Our algorithm works by alternating an alignment-based procedure, which we show effectively reconstructs portions of the source string that are not "highly repetitive", with a novel procedure that efficiently determines the length of highly repetitive subwords of the source string.
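The deletion channel described above can be simulated in a few lines. The following is a minimal sketch for illustration only, not the paper's reconstruction algorithm; the names `deletion_channel` and `rate_threshold` are hypothetical, and the values of n and ε in the demo are arbitrary assumptions.

```python
import random


def deletion_channel(x: str, delta: float, rng: random.Random) -> str:
    """Return one trace of x: each bit is deleted independently with
    probability delta and the surviving bits are concatenated."""
    return "".join(bit for bit in x if rng.random() >= delta)


def rate_threshold(n: int, exponent: float, eps: float) -> float:
    """Deletion-rate bound n^{-(exponent + eps)} (hypothetical helper).

    exponent = 1/2 corresponds to the earlier bound quoted in the abstract,
    exponent = 1/3 to the improved bound.
    """
    return n ** -(exponent + eps)


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so the demo is repeatable
    n, eps = 1000, 0.05      # illustrative values, not from the paper
    x = "".join(rng.choice("01") for _ in range(n))
    delta = rate_threshold(n, 1 / 3, eps)
    traces = [deletion_channel(x, delta, rng) for _ in range(5)]
    print("delta:", delta)
    print("trace lengths:", [len(t) for t in traces])
```

For any fixed n > 1 and ε > 0, the bound with exponent 1/3 is larger than the one with exponent 1/2, so the improved result tolerates a higher deletion rate.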

Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio, and Sandip Sinha. Polynomial-Time Trace Reconstruction in the Low Deletion Rate Regime. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 20:1-20:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{chen_et_al:LIPIcs.ITCS.2021.20,
  author =	{Chen, Xi and De, Anindya and Lee, Chin Ho and Servedio, Rocco A. and Sinha, Sandip},
  title =	{{Polynomial-Time Trace Reconstruction in the Low Deletion Rate Regime}},
  booktitle =	{12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages =	{20:1--20:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-177-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{185},
  editor =	{Lee, James R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.20},
  URN =		{urn:nbn:de:0030-drops-135595},
  doi =		{10.4230/LIPIcs.ITCS.2021.20},
  annote =	{Keywords: trace reconstruction}
}

**Published in:** LIPIcs, Volume 145, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)

A number of recent works have considered the trace reconstruction problem, in which an unknown source string x ∈ {0,1}^n is transmitted through a probabilistic channel which may randomly delete coordinates or insert random bits, resulting in a trace of x. The goal is to reconstruct the original string x from independent traces of x. While the asymptotically best algorithms known for worst-case strings use exp(O(n^{1/3})) traces [De et al., 2017; Fedor Nazarov and Yuval Peres, 2017], several highly efficient algorithms are known [Yuval Peres and Alex Zhai, 2017; Nina Holden et al., 2018] for the average-case version of the problem, in which the source string x is chosen uniformly at random from {0,1}^n. In this paper we consider a generalization of the above-described average-case trace reconstruction problem, which we call average-case population recovery in the presence of insertions and deletions. In this problem, rather than a single unknown source string there is an unknown distribution over s unknown source strings x^1,...,x^s ∈ {0,1}^n, and each sample given to the algorithm is independently generated by drawing some x^i from this distribution and returning an independent trace of x^i. Building on the results of [Yuval Peres and Alex Zhai, 2017] and [Nina Holden et al., 2018], we give an efficient algorithm for the average-case population recovery problem in the presence of insertions and deletions. For any support size 1 ≤ s ≤ exp(Θ(n^{1/3})), for a 1-o(1) fraction of all s-element support sets {x^1,...,x^s} ⊂ {0,1}^n, for every distribution D supported on {x^1,...,x^s}, our algorithm can efficiently recover D up to total variation distance at most ε with high probability, given access to independent traces of independent draws from D. The running time of our algorithm is poly(n,s,1/ε) and its sample complexity is poly(s,1/ε,exp(log^{1/3} n)).
This polynomial dependence on the support size s is in sharp contrast with the worst-case version of the problem (when x^1,...,x^s may be any strings in {0,1}^n), in which the sample complexity of the most efficient known algorithm [Frank Ban et al., 2019] is doubly exponential in s.
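The sampling model and the error measure in this abstract can be illustrated with a short sketch. It makes simplifying assumptions: the channel below performs deletions only (the abstract's channel may also insert random bits, and its exact insertion model is not given here), and the names `trace`, `draw_sample`, and `total_variation` are hypothetical. It shows how samples are generated and how recovery error is measured, not how the paper's algorithm recovers D.

```python
import random


def trace(x: str, delta: float, rng: random.Random) -> str:
    """Deletion-only channel (assumption): drop each bit independently
    with probability delta."""
    return "".join(bit for bit in x if rng.random() >= delta)


def draw_sample(support, weights, delta, rng):
    """Draw one source string x^i according to the distribution given by
    `weights`, then return an independent trace of it."""
    x = rng.choices(support, weights=weights, k=1)[0]
    return trace(x, delta, rng)


def total_variation(p: dict, q: dict) -> float:
    """Total variation distance between two distributions given as
    {outcome: probability} dictionaries: half the L1 distance."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)


if __name__ == "__main__":
    rng = random.Random(0)
    support = ["00110101", "11001010", "10101100"]  # illustrative strings
    weights = [0.5, 0.3, 0.2]                       # illustrative distribution D
    samples = [draw_sample(support, weights, 0.1, rng) for _ in range(5)]
    print(samples)

    true_d = dict(zip(support, weights))
    guess = dict(zip(support, [0.45, 0.35, 0.2]))   # a hypothetical estimate of D
    print(total_variation(true_d, guess))
```

An estimate of D is acceptable in the abstract's sense when its total variation distance from D is at most ε.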

Frank Ban, Xi Chen, Rocco A. Servedio, and Sandip Sinha. Efficient Average-Case Population Recovery in the Presence of Insertions and Deletions. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 145, pp. 44:1-44:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

@InProceedings{ban_et_al:LIPIcs.APPROX-RANDOM.2019.44,
  author =	{Ban, Frank and Chen, Xi and Servedio, Rocco A. and Sinha, Sandip},
  title =	{{Efficient Average-Case Population Recovery in the Presence of Insertions and Deletions}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)},
  pages =	{44:1--44:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-125-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{145},
  editor =	{Achlioptas, Dimitris and V\'{e}gh, L\'{a}szl\'{o} A.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2019.44},
  URN =		{urn:nbn:de:0030-drops-112592},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2019.44},
  annote =	{Keywords: population recovery, deletion channel, trace reconstruction}
}
