Efficient Exact Online String Matching Through Linked Weak Factors

Authors Matthew N. Palmer, Simone Faro, Stefano Scafiti

Author Details

Matthew N. Palmer
  • The British Computer Society, Swindon, United Kingdom
Simone Faro
  • Department of Mathematics and Computer Science, University of Catania, Italy
Stefano Scafiti
  • Department of Mathematics and Computer Science, University of Catania, Italy

Matthew N. Palmer, Simone Faro, and Stefano Scafiti. Efficient Exact Online String Matching Through Linked Weak Factors. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 24:1-24:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Online exact string matching is a fundamental problem in computer science: searching sequentially for a pattern within a large text, without prior access to the entire text. Efficient substring matching is crucial in applications such as data compression, data mining, text editing, and bioinformatics, to name a few. Although the problem has been studied for years, recent decades have seen a heightened focus on experimental solutions that employ a variety of techniques to achieve superior performance. Notably, approaches based on weak factor recognition have emerged as leaders in experimental settings and are attracting increasing attention. This paper introduces Hash Chain, a novel algorithm founded on a robust weak factor recognition approach that links adjacent factors through hashing. Building on the efficacy of weak recognition techniques, the proposed algorithm incorporates innovative strategies for organizing data structures, along with optimizations to enhance performance. Despite its quadratic worst-case time complexity, the newly proposed algorithm exhibits sublinear behavior in practice, outperforming currently known algorithms in the literature.
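To make the idea of weak recognition concrete, the following is a minimal sketch of a hash-based q-gram filter. It is not the Hash Chain algorithm described in the paper and does not link adjacent factors. It only illustrates the general principle that a small hash table can reject text windows cheaply, may accept false positives (which are then verified), and never misses an occurrence. The function names, the hash function, and the parameters q and bits are all illustrative assumptions.

```python
def qgram_hash(gram, bits):
    """Hypothetical hash: fold the bytes of a q-gram into a `bits`-bit value."""
    mask = (1 << bits) - 1
    h = 0
    for b in gram:
        h = (h * 31 + b) & mask
    return h


def weak_factor_search(pattern, text, q=3, bits=12):
    """Return the start positions of all occurrences of pattern in text.

    Illustrative sketch only: a q-gram is "weakly recognized" as a factor
    of the pattern when its hash is set in the table. Collisions can cause
    false positives, so every candidate window is verified explicitly.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    q = min(q, m)

    # Preprocessing: mark the hash of every q-gram of the pattern.
    table = bytearray(1 << bits)
    for i in range(m - q + 1):
        table[qgram_hash(pattern[i:i + q], bits)] = 1

    matches = []
    pos = m - 1  # index of the last character of the current window
    while pos < n:
        if not table[qgram_hash(text[pos - q + 1:pos + 1], bits)]:
            # This q-gram is certainly not a factor of the pattern, so no
            # occurrence can contain it: skip every window that covers it.
            pos += m - q + 1
        else:
            # Possible factor (or a hash collision): verify the window.
            start = pos - m + 1
            if text[start:pos + 1] == pattern:
                matches.append(start)
            pos += 1
    return matches
```

For example, weak_factor_search(b"abra", b"abracadabra") returns the start positions of both occurrences. In this sketch the quadratic worst case arises when every q-gram passes the filter and each window has to be verified, while long shifts on rejected q-grams are what allow fewer than n character inspections on typical inputs.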

Subject Classification

ACM Subject Classification
  • Theory of computation → Bloom filters and hashing
  • Theory of computation → Pattern matching

Keywords
  • String matching
  • text processing
  • weak recognition
  • hashing
  • experimental algorithms
  • design and analysis of algorithms


