The Complexity of Problems in P Given Correlated Instances

Goldwasser, Shafi; Holden, Dhiraj

doi:10.4230/LIPIcs.ITCS.2017.13

Abstract

Instances of computational problems do not exist in isolation. Rather, multiple correlated instances of the same problem arise naturally in the real world. The challenge is how to gain computationally from correlations when they can be found. [DGH, ITCS 2015] showed that significant computational gains can be made by having access to auxiliary instances which are correlated with the primary problem instance via the solution space. They demonstrate this for constraint satisfaction problems, which are NP-hard in the general worst-case form.

Here, we set out to study the impact of having access to correlated instances on the complexity of polynomial time problems. Namely, for a problem P that is conjectured to require time n^c for c>0, we ask whether access to a few instances of P that are correlated in some natural way can be used to solve P on one of them (the designated "primary instance") faster than the conjectured lower bound of n^c.

We focus our attention on a number of problems: the Longest Common Subsequence (LCS), the minimum Edit Distance between sequences, and Dynamic Time Warping Distance (DTWD) of curves, for all of which the best known algorithms achieve O(n^2/polylog(n)) runtime via dynamic programming. These problems form an interesting case in point to study, as it has been shown that an O(n^(2 - epsilon)) time algorithm for a worst-case instance would imply improved algorithms for a host of other problems as well as disprove complexity hypotheses such as the Strong Exponential Time Hypothesis.
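As a point of reference for the quadratic baseline mentioned above, the following is a minimal sketch of the textbook dynamic program for LCS length. It is not the algorithm of this paper and does not include the polylogarithmic speedups; the function name `lcs_length` is an illustrative choice, not something taken from the paper.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b.

    Textbook dynamic program: O(len(a) * len(b)) time, O(len(b)) space.
    """
    # prev[j] = LCS length of the already-processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]  # LCS with the empty prefix of b is always 0
        for j, other in enumerate(b, start=1):
            if ch == other:
                # Matching characters extend the LCS of both shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last character of one of the two prefixes.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

Every cell of the n-by-n table is filled exactly once, which is where the n^2 term comes from. Edit distance and dynamic time warping follow the same table-filling pattern with different recurrences.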

We show how to use access to a logarithmic number of auxiliary correlated instances to design novel o(n^2) time algorithms for LCS, EDIT, DTWD, and more generally improved algorithms for computing any tuple-based similarity measure on strings (a generalization which we define within). For the multiple sequence alignment problem on k strings, this yields an O(nk log n) algorithm, contrasting with classical O(n^k) dynamic programming.

Our results hold for several correlation models between the primary and the auxiliary instances. In the most general correlation model we address, we assume that the primary instance is a worst-case instance and the auxiliary instances are chosen uniformly at random subject to the constraint that their alignments are epsilon-correlated with the optimal alignment of the primary instance. We emphasize that optimal solutions for the auxiliary instances will not generally coincide with optimal solutions for the worst-case primary instance.

We view our work as pointing out a new avenue for seeking significant improvements for sequence alignment problems and computing similarity measures, by taking advantage of access to sequences which are correlated through natural generating processes. In this first work we show how to take advantage of simple, clean, mathematically inspired models of correlation. The intriguing question, looking forward, is to find correlation models which coincide with evolutionary models and other relationships and for which our approach to multiple sequence alignment gives provable guarantees.

Cite As

Shafi Goldwasser and Dhiraj Holden. The Complexity of Problems in P Given Correlated Instances. In 8th Innovations in Theoretical Computer Science Conference (ITCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 67, pp. 13:1-13:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.ITCS.2017.13

Author Details

Shafi Goldwasser

Dhiraj Holden

References

Amir Abboud, Arturs Backurs, and V. Vassilevska Williams. Tight hardness results for LCS and other sequence similarity measures. In FOCS 2015, 2015.
Josh Alman, Timothy M Chan, and Ryan Williams. Polynomial representations of threshold functions and algorithmic applications. arXiv preprint arXiv:1608.04355, 2016.
Alexandr Andoni and Robert Krauthgamer. The smoothed complexity of edit distance. ACM Transactions on Algorithms (TALG), 8(4):44, 2012.
Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). In STOC 2015, 2015.
Karl Bringmann and Marvin Künnemann. Quadratic conditional lower bounds for string problems and dynamic time warping. In Foundations of Computer Science (FOCS), 2015 IEEE 56th Annual Symposium on, pages 79-97. IEEE, 2015.
Irit Dinur, Shafi Goldwasser, and Huijia Lin. The computational benefit of correlated instances. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, ITCS 2015, Rehovot, Israel, January 11-13, 2015, pages 219-228, 2015. URL: http://dx.doi.org/10.1145/2688073.2688082.
Anka Gajentaan and Mark H Overmars. On a class of O(n²) problems in computational geometry. Computational Geometry, 5(3):165-185, 1995.
Nadia Heninger, Zakir Durumeric, Eric Wustrow, and J Alex Halderman. Mining your Ps and Qs: Detection of widespread weak keys in network devices. In 21st USENIX Security Symposium (USENIX Security 12), pages 205-220, 2012.
James W Hunt and Thomas G Szymanski. A fast algorithm for computing longest common subsequences. Communications of the ACM, 20(5):350-353, 1977.
Russell Impagliazzo and Ramamohan Paturi. Complexity of k-SAT. In Computational Complexity, 1999. Proceedings. Fourteenth Annual IEEE Conference on, pages 237-240. IEEE, 1999.
Michael Mitzenmacher and Eli Upfal. Probability and computing: Randomized algorithms and probabilistic analysis. Cambridge University Press, 2005.
Daniel A Spielman and Shang-Hua Teng. Smoothed analysis. In Algorithms and data structures, pages 256-270. Springer, 2003.
