Trace Reconstruction from Local Statistical Queries

Authors: Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio




Author Details

Xi Chen
  • Columbia University, New York, NY, USA
Anindya De
  • University of Pennsylvania, Philadelphia, PA, USA
Chin Ho Lee
  • North Carolina State University, Raleigh, NC, USA
Rocco A. Servedio
  • Columbia University, New York, NY, USA

Cite As

Xi Chen, Anindya De, Chin Ho Lee, and Rocco A. Servedio. Trace Reconstruction from Local Statistical Queries. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 52:1-52:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2024.52

Abstract

The goal of trace reconstruction is to reconstruct an unknown n-bit string x given only independent random traces of x, where a random trace of x is obtained by passing x through a deletion channel. A Statistical Query (SQ) algorithm for trace reconstruction is an algorithm which can only access statistical information about the distribution of random traces of x rather than individual traces themselves. Such an algorithm is said to be 𝓁-local if each of its statistical queries corresponds to an 𝓁-junta function over some block of 𝓁 consecutive bits in the trace. Since several - but not all - known algorithms for trace reconstruction fall under the local statistical query paradigm, it is interesting to understand the abilities and limitations of local SQ algorithms for trace reconstruction. In this paper we establish nearly-matching upper and lower bounds on local Statistical Query algorithms for both worst-case and average-case trace reconstruction. For the worst-case problem, we show that there is an Õ(n^{1/5})-local SQ algorithm that makes all its queries with tolerance τ ≥ 2^{-Õ(n^{1/5})}, and also that any Õ(n^{1/5})-local SQ algorithm must make some query with tolerance τ ≤ 2^{-Ω̃(n^{1/5})}. For the average-case problem, we show that there is an O(log n)-local SQ algorithm that makes all its queries with tolerance τ ≥ 1/poly(n), and also that any O(log n)-local SQ algorithm must make some query with tolerance τ ≤ 1/poly(n).
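The two objects defined in the abstract, the deletion channel and a local statistical query, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' algorithm: the function names, the parameter `delta` for the deletion probability, and the convention that a trace too short to cover the queried window contributes 0 are all assumptions made for illustration. The sketch answers a query by averaging over sampled traces, whereas an SQ oracle in the abstract's sense would return any value within tolerance τ of the true expectation.

```python
import random

def deletion_channel(x, delta, rng):
    """Return a random trace of x: each bit is deleted independently
    with probability delta (hypothetical parameter name)."""
    return [b for b in x if rng.random() >= delta]

def local_sq(x, delta, start, ell, query, num_traces, rng):
    """Estimate E[query(window)] where window is the block of ell
    consecutive trace bits beginning at position `start`.

    `query` maps a tuple of ell bits to a value in [0, 1].
    Assumption: a trace that is too short to cover the window
    contributes 0 to the average.
    """
    total = 0.0
    for _ in range(num_traces):
        trace = deletion_channel(x, delta, rng)
        window = tuple(trace[start:start + ell])
        if len(window) == ell:
            total += query(window)
    return total / num_traces

if __name__ == "__main__":
    rng = random.Random(0)
    x = [1, 0, 1, 1, 0, 0, 1, 0]
    # Example 3-local query: indicator that the window equals (1, 1, 0).
    indicator = lambda w: 1.0 if w == (1, 1, 0) else 0.0
    estimate = local_sq(x, 0.3, 1, 3, indicator, 5000, rng)
    print(f"estimated query value: {estimate:.3f}")
```

By standard concentration bounds, averaging on the order of 1/τ² traces typically suffices to simulate a single query of tolerance τ, which is why the tolerance bounds in the abstract translate into statements about how many traces such algorithms need.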

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Probabilistic inference problems
Keywords
  • trace reconstruction
  • statistical queries
  • algorithmic statistics
