The Complexity of Problems in P Given Correlated Instances

Goldwasser, Shafi; Holden, Dhiraj

doi:10.4230/LIPIcs.ITCS.2017.13

19 pages

Document Identifiers

DOI: 10.4230/LIPIcs.ITCS.2017.13
URN: urn:nbn:de:0030-drops-81753

Author Details

Shafi Goldwasser

Dhiraj Holden

Cite As

Shafi Goldwasser and Dhiraj Holden. The Complexity of Problems in P Given Correlated Instances. In 8th Innovations in Theoretical Computer Science Conference (ITCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 67, pp. 13:1-13:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)
https://doi.org/10.4230/LIPIcs.ITCS.2017.13

Abstract

Instances of computational problems do not exist in isolation. Rather, multiple correlated instances of the same problem arise naturally in the real world. The challenge is how to gain computationally from correlations when they can be found. [DGH, ITCS 2015] showed that significant computational gains can be made by having access to auxiliary instances which are correlated to the primary problem instance via the solution space. They demonstrate this for constraint satisfaction problems, which are NP-hard in their general worst-case form. Here, we set out to study the impact of having access to correlated instances on the complexity of polynomial time problems. Namely, for a problem P that is conjectured to require time n^c for c>0, we ask whether access to a few instances of P that are correlated in some natural way can be used to solve P on one of them (the designated "primary instance") faster than the conjectured lower bound of n^c. We focus our attention on a number of problems: the Longest Common Subsequence (LCS), the minimum Edit Distance between sequences, and Dynamic Time Warping Distance (DTWD) of curves, for all of which the best known algorithms achieve O(n^2/polylog(n)) runtime via dynamic programming. These problems form an interesting case in point to study, as it has been shown that an O(n^(2 - epsilon)) time algorithm for a worst-case instance would imply improved algorithms for a host of other problems, as well as disprove complexity hypotheses such as the Strong Exponential Time Hypothesis. We show how to use access to a logarithmic number of auxiliary correlated instances to design novel o(n^2) time algorithms for LCS, EDIT, DTWD, and more generally improved algorithms for computing any tuple-based similarity measure - a generalization on strings that we define in the paper. For the multiple sequence alignment problem on k strings, this yields an O(nk log n) algorithm, in contrast with the classical O(n^k) dynamic programming.
Our results hold for several correlation models between the primary and the auxiliary instances. In the most general correlation model we address, we assume that the primary instance is a worst-case instance and the auxiliary instances are chosen with uniform distribution subject to the constraint that their alignments are epsilon-correlated with the optimal alignment of the primary instance. We emphasize that optimal solutions for the auxiliary instances will not generally coincide with optimal solutions for the worst-case primary instance. We view our work as pointing out a new avenue for looking for significant improvements for sequence alignment problems and computing similarity measures, by taking advantage of access to sequences which are correlated through natural generating processes. In this first work we show how to take advantage of simple, clean, mathematically inspired models of correlation - the intriguing question, looking forward, is to find correlation models which coincide with evolutionary models and other relationships and for which our approach to multiple sequence alignment gives provable guarantees.
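
For orientation, the quadratic dynamic-programming baselines mentioned in the abstract can be sketched as follows. This is the textbook formulation only, without the polylogarithmic speedups and without any use of correlated instances; it is not the algorithm of the paper. The function names are illustrative, and the DTW sketch assumes one-dimensional numeric sequences with absolute difference as the local cost.

```python
from typing import Sequence


def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence, O(len(a) * len(b)) time."""
    prev = [0] * (len(b) + 1)  # DP row for the empty prefix of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)  # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip x or skip y
        prev = curr
    return prev[-1]


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit (Levenshtein) distance, O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))  # turning "" into b[:j] takes j insertions
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def dtw_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Dynamic time warping distance with |x - y| as the assumed local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(q)  # only two empty sequences align at cost 0
    for x in p:
        curr = [inf]
        for j, y in enumerate(q, start=1):
            best = min(prev[j], curr[j - 1], prev[j - 1])
            curr.append(abs(x - y) + best)
        prev = curr
    return prev[-1]
```

Each routine fills an (n+1) by (m+1) table row by row and keeps only two rows in memory, so the running time is quadratic for inputs of equal length. That quadratic cost is the baseline the results above improve on when auxiliary correlated instances are available.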

Keywords

Correlated instances
Longest Common Subsequence
Fine-grained complexity

References

Amir Abboud, Arturs Backurs, and V. Vassilevska Williams. Tight hardness results for LCS and other sequence similarity measures. In FOCS 2015, 2015.
Josh Alman, Timothy M Chan, and Ryan Williams. Polynomial representations of threshold functions and algorithmic applications. arXiv preprint arXiv:1608.04355, 2016.
Alexandr Andoni and Robert Krauthgamer. The smoothed complexity of edit distance. ACM Transactions on Algorithms (TALG), 8(4):44, 2012.
Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). In STOC 2015, 2015.
Karl Bringmann and Marvin Künnemann. Quadratic conditional lower bounds for string problems and dynamic time warping. In Foundations of Computer Science (FOCS), 2015 IEEE 56th Annual Symposium on, pages 79-97. IEEE, 2015.
Irit Dinur, Shafi Goldwasser, and Huijia Lin. The computational benefit of correlated instances. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, ITCS 2015, Rehovot, Israel, January 11-13, 2015, pages 219-228, 2015. URL: http://dx.doi.org/10.1145/2688073.2688082.
Anka Gajentaan and Mark H Overmars. On a class of O(n²) problems in computational geometry. Computational Geometry, 5(3):165-185, 1995.
Nadia Heninger, Zakir Durumeric, Eric Wustrow, and J Alex Halderman. Mining your Ps and Qs: Detection of widespread weak keys in network devices. In 21st USENIX Security Symposium (USENIX Security 12), pages 205-220, 2012.
James W Hunt and Thomas G Szymanski. A fast algorithm for computing longest common subsequences. Communications of the ACM, 20(5):350-353, 1977.
Russell Impagliazzo and Ramamohan Paturi. Complexity of k-SAT. In Computational Complexity, 1999. Proceedings. Fourteenth Annual IEEE Conference on, pages 237-240. IEEE, 1999.
Michael Mitzenmacher and Eli Upfal. Probability and computing: Randomized algorithms and probabilistic analysis. Cambridge University Press, 2005.
Daniel A Spielman and Shang-Hua Teng. Smoothed analysis. In Algorithms and data structures, pages 256-270. Springer, 2003.