Internal Pattern Matching in Small Space and Applications

Authors: Gabriel Bathie, Panagiotis Charalampopoulos, Tatiana Starikovskaya

Author Details

Gabriel Bathie
  • DIENS, École normale supérieure de Paris, PSL Research University, France
  • LaBRI, Université de Bordeaux, France
Panagiotis Charalampopoulos
  • Birkbeck, University of London, UK
Tatiana Starikovskaya
  • DIENS, École normale supérieure de Paris, PSL Research University, France


We would like to dedicate this work to our dear friend and colleague Paweł Gawrychowski on the occasion of his 40th birthday. We thank Solon Pissis for helpful suggestions.

Gabriel Bathie, Panagiotis Charalampopoulos, and Tatiana Starikovskaya. Internal Pattern Matching in Small Space and Applications. In 35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 296, pp. 4:1-4:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


In this work, we consider pattern matching variants in small space, that is, in the read-only setting, where we want to bound the space usage on top of storing the strings. Our main contribution is a space-time trade-off for the Internal Pattern Matching (IPM) problem, where the goal is to construct a data structure over a string S of length n that allows one to answer the following type of queries: Compute the occurrences of a fragment P of S inside another fragment T of S, provided that |T| < 2|P|. For any τ ∈ [1 .. n/log² n], we present a nearly optimal Õ(n/τ)-size data structure that can be built in Õ(n) time using Õ(n/τ) extra space, and answers IPM queries in O(τ + log n log³ log n) time. IPM queries have been identified as a crucial primitive operation for the analysis of algorithms on strings. In particular, the complexities of several recent algorithms for approximate pattern matching are expressed in terms of the number of calls to a small set of primitive operations that include IPM queries; our data structure allows us to port these results to the small-space setting. We further showcase the applicability of our IPM data structure by using it to obtain space-time trade-offs for the longest common substring and circular pattern matching problems in the asymmetric streaming setting.
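To make the query type concrete, the following is a minimal brute-force sketch of what an IPM query returns. It is not the data structure from the paper and makes no attempt at the stated space or time bounds; the function name `ipm_query` and the half-open index convention are assumptions chosen for illustration only.

```python
def ipm_query(S, p_start, p_end, t_start, t_end):
    """Brute-force illustration of an IPM query (hypothetical helper).

    P = S[p_start:p_end] and T = S[t_start:t_end] are fragments of S,
    given as half-open index ranges. Returns the starting positions in S
    of all occurrences of P that lie entirely inside T.
    """
    m = p_end - p_start          # |P|
    t_len = t_end - t_start      # |T|
    if m <= 0 or t_len >= 2 * m:
        # The IPM problem only covers queries with |T| < 2|P|.
        raise ValueError("expected a non-empty P and |T| < 2|P|")
    P = S[p_start:p_end]
    # Try every alignment of P inside T; this takes O(|P| * |T|) time.
    return [i for i in range(t_start, t_end - m + 1) if S[i:i + m] == P]


# Usage: P = "abab" and T = "abababa" are both fragments of S.
S = "abababab"
print(ipm_query(S, 0, 4, 0, 7))  # [0, 2]
```

The baseline reads only S and the four indices, which matches the read-only setting, but each query costs time proportional to |P|·|T|; the trade-off described above replaces this scan with an Õ(n/τ)-size structure.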

  • Theory of computation → Pattern matching
  • internal pattern matching
  • longest common substring
  • small-space algorithms


