Polynomial-Time Trace Reconstruction in the Low Deletion Rate Regime

Authors Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio, Sandip Sinha



PDF
Thumbnail PDF

File

LIPIcs.ITCS.2021.20.pdf
  • Filesize: 0.63 MB
  • 20 pages

Document Identifiers

Author Details

Xi Chen
  • Columbia University, New York, NY, USA
Anindya De
  • University of Pennsylvania, Philadelphia, PA, USA
Chin Ho Lee
  • Columbia University, New York, NY, USA
Rocco A. Servedio
  • Columbia University, New York, NY, USA
Sandip Sinha
  • Columbia University, New York, NY, USA

Cite AsGet BibTex

Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio, and Sandip Sinha. Polynomial-Time Trace Reconstruction in the Low Deletion Rate Regime. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 20:1-20:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.ITCS.2021.20

Abstract

In the trace reconstruction problem, an unknown source string x ∈ {0,1}ⁿ is transmitted through a probabilistic deletion channel which independently deletes each bit with some fixed probability δ and concatenates the surviving bits, resulting in a trace of x. The problem is to reconstruct x given access to independent traces. Trace reconstruction of arbitrary (worst-case) strings is a challenging problem, with the current state of the art for poly(n)-time algorithms being the 2004 algorithm of Batu et al. [T. Batu et al., 2004]. This algorithm can reconstruct an arbitrary source string x ∈ {0,1}ⁿ in poly(n) time provided that the deletion rate δ satisfies δ ≤ n^{-(1/2 + ε)} for some ε > 0. In this work we improve on the result of [T. Batu et al., 2004] by giving a poly(n)-time algorithm for trace reconstruction for any deletion rate δ ≤ n^{-(1/3 + ε)}. Our algorithm works by alternating an alignment-based procedure, which we show effectively reconstructs portions of the source string that are not "highly repetitive", with a novel procedure that efficiently determines the length of highly repetitive subwords of the source string.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Probabilistic inference problems
Keywords
  • trace reconstruction

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Alexandr Andoni, Constantinos Daskalakis, Avinatan Hassidim, and Sebastien Roch. Global alignment of molecular sequences via ancestral state reconstruction. Stochastic Processes and their Applications, 122(12):3852-3874, 2012. Google Scholar
  2. T. Batu, S. Kannan, S. Khanna, and A. McGregor. Reconstructing strings from random traces. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2004, pages 910-918, 2004. Google Scholar
  3. Z. Chase. New lower bounds for trace reconstruction. CoRR, abs/1905.03031, 2019. URL: http://arxiv.org/abs/1905.03031.
  4. Anindya De, Ryan O'Donnell, and Rocco A. Servedio. Optimal mean-based algorithms for trace reconstruction. In Proceedings of the 49th ACM Symposium on Theory of Computing (STOC), pages 1047-1056, 2017. Google Scholar
  5. N. Holden and R. Lyons. Lower bounds for trace reconstruction. CoRR, abs/1808.02336, 2018. URL: http://arxiv.org/abs/1808.02336.
  6. Nina Holden, Robin Pemantle, and Yuval Peres. Subpolynomial trace reconstruction for random strings and arbitrary deletion probability. CoRR, abs/1801.04783, 2018. URL: http://arxiv.org/abs/1801.04783.
  7. Nina Holden, Robin Pemantle, Yuval Peres, and Alex Zhai. Subpolynomial trace reconstruction for random strings and arbitrary deletion probability. CoRR, abs/1801.04783, 2020. URL: http://arxiv.org/abs/1801.04783.
  8. T. Holenstein, M. Mitzenmacher, R. Panigrahy, and U. Wieder. Trace reconstruction with constant deletion probability and related results. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2008, pages 389-398, 2008. Google Scholar
  9. V. V. Kalashnik. Reconstruction of a word from its fragments. Computational Mathematics and Computer Science (Vychislitel'naya matematika i vychislitel'naya tekhnika), Kharkov, 4:56-57, 1973. Google Scholar
  10. Vladimir Levenshtein. Efficient reconstruction of sequences. IEEE Transactions on Information Theory, 47(1):2-22, 2001. Google Scholar
  11. Vladimir Levenshtein. Efficient reconstruction of sequences from their subsequences or supersequences. Journal of Combinatorial Theory Series A, 93(2):310-332, 2001. Google Scholar
  12. Andrew McGregor, Eric Price, and Sofya Vorotnikova. Trace reconstruction revisited. In Proceedings of the 22nd Annual European Symposium on Algorithms, pages 689-700, 2014. Google Scholar
  13. Michael Mitzenmacher. A survey of results for deletion channels and related synchronization channels. Probability Surveys, 6:1-33, 2009. Google Scholar
  14. Fedor Nazarov and Yuval Peres. Trace reconstruction with exp(o(n^1/3)) samples. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, pages 1042-1046, 2017. Google Scholar
  15. Lee Organick, Siena Dumas Ang, Yuan-Jyue Chen, Randolph Lopez, Sergey Yekhanin, Konstantin Makarychev, Miklos Z Racz, Govinda Kamath, Parikshit Gopalan, Bichlien Nguyen, et al. Random access in large-scale dna data storage. Nature biotechnology, 36(3):242, 2018. Google Scholar
  16. S.M. Hossein Tabatabaei Yazdi, Ryan Gabrys, and Olgica Milenkovic. Portable and error-free DNA-based data storage. Scientific Reports, 7(1):5011, 2017. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail