Recovery from Non-Decomposable Distance Oracles

Authors: Zhuangfei Hu, Xinda Li, David P. Woodruff, Hongyang Zhang, Shufan Zhang



File

LIPIcs.ITCS.2023.73.pdf
  • Filesize: 1.05 MB
  • 22 pages

Author Details

Zhuangfei Hu
  • University of Waterloo, Canada
Xinda Li
  • University of Waterloo, Canada
David P. Woodruff
  • Carnegie Mellon University, Pittsburgh, PA, USA
Hongyang Zhang
  • University of Waterloo, Canada
Shufan Zhang
  • University of Waterloo, Canada

Cite As

Zhuangfei Hu, Xinda Li, David P. Woodruff, Hongyang Zhang, and Shufan Zhang. Recovery from Non-Decomposable Distance Oracles. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 251, pp. 73:1-73:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.ITCS.2023.73

Abstract

A line of work has studied the problem of recovering an input from distance queries. In this setting, there is an unknown sequence s ∈ {0,1}^{≤ n}; one chooses a set of queries y ∈ {0,1}^{O(n)} and receives d(s,y) for a distance function d. The goal is to recover s using as few queries as possible. This problem is well studied for decomposable distances, i.e., distances of the form d(s,y) = ∑_{i=1}^n f(s_i, y_i) for some function f, which include the important cases of Hamming distance, ℓ_p-norms, and M-estimators. To the best of our knowledge, however, it has not been studied for non-decomposable distances, which include important special cases such as edit distance, dynamic time warping (DTW), Fréchet distance, and earth mover's distance. We initiate the study of this problem and develop a general framework for such distances. Interestingly, for some distances such as DTW or Fréchet, exact recovery of the sequence s is provably impossible; we show that recovery becomes possible when the characters in y may be drawn from a slightly larger alphabet. In a number of cases we obtain optimal or near-optimal query complexity. We also study the role of adaptivity for a number of different distance functions. One motivation for understanding non-adaptivity is that the query sequence can be fixed in advance, so that the distances from the input to the queries provide a non-linear embedding of the input, which can be used in downstream applications involving, e.g., neural networks for natural language processing.
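
As an illustration of the query model described in the abstract, the following is a minimal Python sketch. It is not the paper's algorithm: it assumes edit distance as the oracle and relies on brute-force enumeration of all binary strings of length at most n, which is feasible only for very small n. The helper names (edit_distance, binary_strings_up_to, identifies_all, recover) are hypothetical and chosen for this example. A fixed (non-adaptive) query set permits exact recovery exactly when the vector of distances to the queries is different for every candidate sequence.

```python
from itertools import product


def edit_distance(a, b):
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def binary_strings_up_to(n):
    """All binary strings of length 0..n (the candidate set for s)."""
    return ["".join(bits)
            for length in range(n + 1)
            for bits in product("01", repeat=length)]


def identifies_all(queries, candidates, dist=edit_distance):
    """True iff the distance vectors to the queries are pairwise distinct."""
    seen = set()
    for s in candidates:
        signature = tuple(dist(s, y) for y in queries)
        if signature in seen:
            return False
        seen.add(signature)
    return True


def recover(oracle, queries, candidates, dist=edit_distance):
    """Return every candidate consistent with the oracle's answers."""
    answers = [oracle(y) for y in queries]
    return [s for s in candidates
            if [dist(s, y) for y in queries] == answers]


if __name__ == "__main__":
    candidates = binary_strings_up_to(3)
    secret = "011"  # hidden from recover(); only reachable via the oracle
    oracle = lambda y: edit_distance(secret, y)
    print(identifies_all(candidates, candidates))
    print(recover(oracle, candidates, candidates))
```

With n = 3 and the query set equal to all 15 candidates, identifies_all returns True because d(s, y) = 0 only when y = s. By contrast, the single query "" reveals only the length of s and so cannot separate "0" from "1". The interesting question studied in the paper is how far the number of queries can be reduced below this trivial exhaustive set.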

Subject Classification

ACM Subject Classification
  • Theory of computation → Lower bounds and information complexity
  • Theory of computation → Parameterized complexity and exact algorithms
  • Theory of computation → Algorithm design techniques
Keywords
  • Sequence Recovery
  • Edit Distance
  • DTW Distance
  • Fréchet Distance
