String Matching: Communication, Circuits, and Learning

Golovnev, Alexander; Göös, Mika; Reichman, Daniel; Shinkar, Igor

doi:10.4230/LIPIcs.APPROX-RANDOM.2019.56

Abstract

String matching is the problem of deciding whether a given n-bit string contains a given k-bit pattern. We study the complexity of this problem in three settings.

- Communication complexity. For small k, we provide near-optimal upper and lower bounds on the communication complexity of string matching. For large k, our bounds leave open an exponential gap; we exhibit some evidence for the existence of a better protocol.
- Circuit complexity. We present several upper and lower bounds on the size of circuits with threshold and DeMorgan gates solving the string matching problem. Similarly to the above, our bounds are near-optimal for small k.
- Learning. We consider the problem of learning a hidden pattern of length at most k relative to the classifier that assigns 1 to every string that contains the pattern. We prove optimal bounds on the VC dimension and sample complexity of this problem.
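To make the problem statement concrete, the following is a minimal brute-force sketch of the string matching predicate and of the classifier used in the learning setting. It is only an illustration of the definitions, not one of the protocols or circuits studied in the paper; the names `contains_pattern`, `classifier`, and `all_patterns` are hypothetical and chosen here for readability.

```python
from itertools import product


def contains_pattern(text: str, pattern: str) -> bool:
    """Return True iff `pattern` occurs as a contiguous substring of `text`.

    Both arguments are assumed to be strings over the alphabet {'0', '1'}.
    """
    n, k = len(text), len(pattern)
    # Try every alignment of the k-bit pattern inside the n-bit string.
    return any(text[i:i + k] == pattern for i in range(n - k + 1))


def classifier(pattern: str):
    """Hypothetical helper: the classifier that assigns 1 to every string
    containing `pattern` and 0 to every other string."""
    return lambda text: int(contains_pattern(text, pattern))


def all_patterns(k: int):
    """Enumerate all nonempty bit patterns of length at most k, i.e. the
    candidate hidden patterns in the learning setting (an assumption about
    how one might index the hypothesis class)."""
    for length in range(1, k + 1):
        for bits in product("01", repeat=length):
            yield "".join(bits)
```

For example, `contains_pattern("0110100", "101")` holds because the pattern occurs starting at the third bit, while `contains_pattern("0110100", "111")` does not hold.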

