Document

DOI: 10.4230/LIPIcs.CPM.2026.35

Efficient Index for Square Pattern Matching

Authors: Po-Chun Chen, Che-Wei Tsao, Wing-Kai Hon, and Dominik Köppl

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

A string S is called a square if it can be written as the concatenation of two identical strings. Two strings P and Q of the same length are said to square match if every substring of P is a square if and only if the corresponding substring of Q is also a square. The square pattern matching problem asks to locate all substrings of a given text T of length n that square match a query pattern P of length m. This notion captures similarity in repetition structures and is motivated by applications in areas such as bioinformatics and music structure analysis. In this paper, we introduce a novel technique, called the longest prefix square (LPS) encoding, which represents the square structure of a string as an integer array of the same length. We show that two strings square match if and only if they have identical LPS encodings. Based on this result, we construct an index solving the square pattern matching problem in time O(m lg m + occ) using O(n lg² n) bits of space, where occ denotes the number of substrings of T that square match P. If the LPS encoding of P is precomputed, the query time improves to O(m + occ).
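The definition of square matching can be checked directly by brute force. The sketch below is only an illustration of the definition, not the LPS-based index from the paper; the function names are hypothetical, and it assumes that only non-empty strings count as squares.

```python
def is_square(s: str) -> bool:
    """True if s is non-empty and is the concatenation of two identical strings."""
    half, rem = divmod(len(s), 2)
    return len(s) > 0 and rem == 0 and s[:half] == s[half:]


def square_match(p: str, q: str) -> bool:
    """Brute-force test of the definition: squareness agrees on every substring."""
    if len(p) != len(q):
        return False
    n = len(p)
    for i in range(n):
        # Only even-length substrings can be squares.
        for j in range(i + 2, n + 1, 2):
            if is_square(p[i:j]) != is_square(q[i:j]):
                return False
    return True


def square_match_occurrences(text: str, pattern: str) -> list[int]:
    """Start positions (0-based) of substrings of text that square match pattern."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if square_match(text[i:i + m], pattern)]
```

For example, "aab" and "ccd" square match because both have a single square at the same position, while "aab" and "abb" do not.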

Cite as

Po-Chun Chen, Che-Wei Tsao, Wing-Kai Hon, and Dominik Köppl. Efficient Index for Square Pattern Matching. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 35:1-35:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{chen_et_al:LIPIcs.CPM.2026.35,
  author =	{Chen, Po-Chun and Tsao, Che-Wei and Hon, Wing-Kai and K\"{o}ppl, Dominik},
  title =	{{Efficient Index for Square Pattern Matching}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{35:1--35:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.35},
  URN =		{urn:nbn:de:0030-drops-259617},
  doi =		{10.4230/LIPIcs.CPM.2026.35},
  annote =	{Keywords: string algorithms, pattern matching, indexing, squares}
}

Document

DOI: 10.4230/LIPIcs.CPM.2026.36

Set Parameterized Matching via Multi-Layer Hashing

Authors: Moshe Lewenstein and Ely Porat

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

We study the set parameterized matching problem, a generalization of the classical parameterized matching problem introduced by Baker [Baker, 1993; Baker, 1997]. In set parameterized matching, both the pattern and text are sequences where each position contains a set of characters rather than a single character. Two set-strings parameterized match if there exists a bijection between their alphabets that maps one to the other set-wise. Boussidan [Aaron Boussidan, 2025] introduced this problem for the case of equal-length set-strings. We present a randomized algorithm running in O(N + M) time with high probability, where N is the text size and M is the pattern size. Our approach employs a novel three-layer hashing scheme based on Karp-Rabin fingerprinting that addresses the challenges of (1) the size blowup in representations of the problem, (2) set-to-set matching, and (3) the dynamic nature of encodings of text substrings during pattern scanning.
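For very small alphabets, the matching relation can be tested by trying every bijection. The sketch below is a brute-force reading of the definition in the abstract (a bijection between the two alphabets that maps each pattern set exactly onto the corresponding text set); it is not the three-layer hashing algorithm, and the function name is hypothetical.

```python
from itertools import permutations


def set_parameterized_match(p, t):
    """Brute-force check for two equal-length sequences of character sets."""
    if len(p) != len(t):
        return False
    sigma_p = list(set().union(*p))  # alphabet of the first set-string
    sigma_t = list(set().union(*t))  # alphabet of the second set-string
    if len(sigma_p) != len(sigma_t):
        return False  # no bijection between alphabets of different sizes
    for perm in permutations(sigma_t):
        mapping = dict(zip(sigma_p, perm))
        if all({mapping[c] for c in ps} == set(ts) for ps, ts in zip(p, t)):
            return True
    return False
```

The running time is factorial in the alphabet size, so this is only useful for checking small examples by hand.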

Cite as

Moshe Lewenstein and Ely Porat. Set Parameterized Matching via Multi-Layer Hashing. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 36:1-36:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{lewenstein_et_al:LIPIcs.CPM.2026.36,
  author =	{Lewenstein, Moshe and Porat, Ely},
  title =	{{Set Parameterized Matching via Multi-Layer Hashing}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{36:1--36:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.36},
  URN =		{urn:nbn:de:0030-drops-259620},
  doi =		{10.4230/LIPIcs.CPM.2026.36},
  annote =	{Keywords: Set Parameterized Matching, Pattern Matching, Randomized Algorithms, Hashing, Parameterized Matching}
}

Document

DOI: 10.4230/LIPIcs.CPM.2026.1

Hamming Distance Oracles

Authors: Itai Boneh, Dvir Fried, Shay Golan, Matan Kraus, and Ely Porat

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

In this paper, we present and study the Hamming distance oracle problem. In this problem, the task is to preprocess two strings S and T of lengths n and m, respectively, to obtain a data structure that is able to return the Hamming distance between a substring of S and a substring of T. For strings over a constant-size alphabet, we show that for every x ≤ min{n,m} there is a data structure with Õ(nm/x) preprocessing time and O(x) query time. We also provide a conditional lower bound, showing that for every ε > 0 there is no combinatorial data structure with query time O(x) and preprocessing time O((nm/x)^{1-ε}) unless combinatorial fast matrix multiplication is possible. For strings over a general alphabet, we present a data structure with Õ(nm/√x) preprocessing time and O(x) query time for every x ≤ min{n,m}. Moreover, for every ε > 0 we provide a data structure with a preprocessing time of Õ((n+m)/ε³) that returns with high probability a (1±ε) approximation of the Hamming distance of two input substrings. The query time of the approximation data structure is Õ(1/ε²).
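As a point of reference for the query itself, a baseline with no preprocessing simply compares the two substrings character by character, taking time linear in their length. The sketch below shows only this naive query, not any of the oracles from the paper; the function name and the 0-based (start, length) interface are assumptions.

```python
def hamming_substrings(s: str, i: int, t: str, j: int, length: int) -> int:
    """Hamming distance between s[i:i+length] and t[j:j+length] (0-based)."""
    if i < 0 or j < 0 or length < 0 or i + length > len(s) or j + length > len(t):
        raise ValueError("substring out of range")
    return sum(1 for a, b in zip(s[i:i + length], t[j:j + length]) if a != b)
```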

Cite as

Itai Boneh, Dvir Fried, Shay Golan, Matan Kraus, and Ely Porat. Hamming Distance Oracles. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 1:1-1:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{boneh_et_al:LIPIcs.CPM.2026.1,
  author =	{Boneh, Itai and Fried, Dvir and Golan, Shay and Kraus, Matan and Porat, Ely},
  title =	{{Hamming Distance Oracles}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{1:1--1:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.1},
  URN =		{urn:nbn:de:0030-drops-259278},
  doi =		{10.4230/LIPIcs.CPM.2026.1},
  annote =	{Keywords: Hamming distance, Fine-grained complexity, Data structure, Oracle}
}

Document

DOI: 10.4230/LIPIcs.CPM.2026.2

Near-Real-Time Solutions for Online String Problems

Authors: Dominik Köppl and Gregory Kucherov

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

Based on the Breslauer-Italiano online suffix tree construction algorithm (2013) with double logarithmic worst-case guarantees on the update time per letter, we develop near-real-time algorithms for several classical problems on strings, including the computation of the longest repeating suffix array, the (reversed) Lempel-Ziv 77 factorization, and the maintenance of minimal unique substrings, all in an online manner. Our solutions improve over the best known running times for these problems in terms of the worst-case time per letter, for which we achieve a poly-log-logarithmic time complexity within linear space. The best known results for these problems require a poly-logarithmic time complexity per letter or only provide amortized complexity bounds. As a result of independent interest, we give conversions between the longest previous factor array and the longest repeating suffix array in space and time bounds based on their irreducible representations, which can have sizes sublinear in the length of the input string.

Cite as

Dominik Köppl and Gregory Kucherov. Near-Real-Time Solutions for Online String Problems. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 2:1-2:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{koppl_et_al:LIPIcs.CPM.2026.2,
  author =	{K\"{o}ppl, Dominik and Kucherov, Gregory},
  title =	{{Near-Real-Time Solutions for Online String Problems}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{2:1--2:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.2},
  URN =		{urn:nbn:de:0030-drops-259287},
  doi =		{10.4230/LIPIcs.CPM.2026.2},
  annote =	{Keywords: online algorithms, string algorithms, suffix tree, real-time computation, Lempel-Ziv factorization, minimal unique substrings}
}

Document

DOI: 10.4230/LIPIcs.CPM.2026.3

Computing k-mers in Graphs

Authors: Jarno N. Alanko and Máximo Pérez-López

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

We initiate the study of computational problems on k-mers (strings of length k) in labeled graphs. As a starting point, we consider the problem of counting the number of distinct k-mers found on the walks of a graph. We establish that this is #P-hard, even on connected deterministic DAGs. However, in the class of deterministic Wheeler graphs (Gagie, Manzini, and Sirén, TCS 2017), we show that distinct k-mers of such a graph W = (V, E) can be counted using O(|W|k) or O(n⁴ log k) arithmetic operations, where n = |V|, m = |E| and |W| = n+m. The latter result uses a new generalization of the technique of prefix doubling to Wheeler graphs. To generalize our results beyond Wheeler graphs, we discuss ways to transform a graph into a Wheeler graph in a manner that preserves the k-mers. As an application of our k-mer counting algorithms, we construct a representation of the de Bruijn graph of the k-mers that occupies O(n_k + |W|k log(max_{1 ≤ 𝓁 ≤ k} n_𝓁) + σ log m) bits of space, where n_𝓁 is the number of distinct 𝓁-mers in the Wheeler graph, and σ is the size of the alphabet. We show how to construct it in the same time complexity. Given that the Wheeler graph can be exponentially smaller than the de Bruijn graph, for large k this provides a theoretical improvement over previous de Bruijn graph construction methods from graphs, which must spend Ω(k) time per k-mer in the graph.
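To make the counting problem concrete, the distinct k-mers of a small edge-labeled graph can be enumerated explicitly. The sketch below is a brute-force illustration that may take exponential time and space in k; it is not the Wheeler-graph algorithm, and the edge-list representation with single-character labels and the function name are assumptions.

```python
def distinct_kmers(edges: list[tuple[int, int, str]], k: int) -> set[str]:
    """All strings of length k spelled by walks in an edge-labeled graph.

    edges is a list of (source, target, label) triples, where each label
    is assumed to be a single character.
    """
    nodes = {u for u, _, _ in edges} | {v for _, v, _ in edges}
    # spelled[v] holds the strings of the current length spelled by walks from v.
    spelled = {v: {""} for v in nodes}
    for _ in range(k):
        longer = {v: set() for v in nodes}
        for u, v, label in edges:
            for w in spelled[v]:
                longer[u].add(label + w)  # prepend the edge label
        spelled = longer
    return set().union(*spelled.values())
```

Counting the distinct k-mers is then `len(distinct_kmers(edges, k))`.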

Cite as

Jarno N. Alanko and Máximo Pérez-López. Computing k-mers in Graphs. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 3:1-3:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{alanko_et_al:LIPIcs.CPM.2026.3,
  author =	{Alanko, Jarno N. and P\'{e}rez-L\'{o}pez, M\'{a}ximo},
  title =	{{Computing k-mers in Graphs}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{3:1--3:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.3},
  URN =		{urn:nbn:de:0030-drops-259294},
  doi =		{10.4230/LIPIcs.CPM.2026.3},
  annote =	{Keywords: Wheeler graph, Wheeler language, de Bruijn graph, graph, k-mer, q-gram, DFA, #P-hard}
}

Document

DOI: 10.4230/LIPIcs.CPM.2026.4

Compact Representation of Maximal Palindromes

Authors: Takuya Mieno

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

Palindromes are strings that read the same forward and backward. The computation of palindromic structures within strings is a fundamental problem in string algorithms, being motivated by potential applications in formal language theory and bioinformatics. Although the number of palindromic factors in a string of length n can be quadratic, they can be implicitly represented in O(n log n) bits of space by storing the lengths of all maximal palindromes in an integer array, which can be computed in O(n) time [Manacher, 1975]. In this paper, we propose a novel O(n)-bit representation of all maximal palindromes in a string, which enables O(1)-time retrieval of the length of the maximal palindrome centered at any given position. The data structure can be constructed in O(n) time from the input string of length n. Since Manacher’s algorithm and the notion of maximal palindromes are widely utilized for solving numerous problems involving palindromic structures, our compact representation will accelerate the development of more space-efficient solutions to such problems. Indeed, as the first application of our compact representation of maximal palindromes, we present a data structure of size O(n) bits that can compute the longest palindrome appearing in any given factor of a string of length n in O(log n) time.
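The integer array of maximal palindrome lengths mentioned above is what Manacher's algorithm computes in O(n) time. The sketch below is a standard textbook formulation of that algorithm, not the O(n)-bit representation proposed in the paper; the function name and the output layout (one entry per center, alternating character centers and gaps between characters) are assumptions.

```python
def maximal_palindromes(s: str) -> list[int]:
    """Lengths of the maximal palindromes at all 2n-1 centers of s.

    Entry 2*i is the center on character i; entry 2*i+1 is the center
    between characters i and i+1.
    """
    # Interleave separators so that every palindrome in t has odd length.
    t = "#" + "#".join(s) + "#"
    n = len(t)
    rad = [0] * n  # rad[i] = palindrome radius at t[i] = palindrome length in s
    center = right = 0  # palindrome with the rightmost boundary found so far
    for i in range(n):
        if i < right:
            # Reuse the radius of the mirrored position inside the known palindrome.
            rad[i] = min(right - i, rad[2 * center - i])
        a, b = i - rad[i] - 1, i + rad[i] + 1
        while a >= 0 and b < n and t[a] == t[b]:
            rad[i] += 1
            a -= 1
            b += 1
        if i + rad[i] > right:
            center, right = i, i + rad[i]
    return rad[1:-1]  # drop the two boundary separators
```

Stored as plain integers this array takes O(n log n) bits, which is the baseline the abstract compares against.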

Cite as

Takuya Mieno. Compact Representation of Maximal Palindromes. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 4:1-4:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{mieno:LIPIcs.CPM.2026.4,
  author =	{Mieno, Takuya},
  title =	{{Compact Representation of Maximal Palindromes}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{4:1--4:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.4},
  URN =		{urn:nbn:de:0030-drops-259304},
  doi =		{10.4230/LIPIcs.CPM.2026.4},
  annote =	{Keywords: palindromes, succinct data structures, internal queries}
}

Document

DOI: 10.4230/LIPIcs.CPM.2026.7

Optimal Structure for Prefix-Substring Queries

Authors: Paweł Gawrychowski, Florin Manea, and Jonas Richardsen

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

The prefix-substring matching problem [Gu, Farach, and Beigel, SODA 1994] asks to preprocess a string s of length n to answer the following queries: given a triple (i, j, k) ∈ {0, … , |s|}³ with 1 ≤ j ≤ k, representing a prefix s[1:i] and a substring s[j:k] of s, find the longest prefix of s that is a suffix of s[1:i]s[j:k]. This is a useful primitive in, e.g., dynamic text indexing, compressed pattern matching, and pattern matching on block graphs. The border tree uses some basic periodicity properties to answer such queries in 𝒪(log n) time after 𝒪(n) time preprocessing of s. We design a linear-space structure that answers such queries in constant time after 𝒪(n) time preprocessing of s over a polynomial alphabet, which is worst-case optimal.
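The query can be stated as a few lines of naive code, which is handy for checking small cases by hand. The sketch below answers a query by trying every candidate length, so it is far from the constant-time structure of the paper; the function name is hypothetical, and it assumes the 1-based inclusive indices of the abstract, with i = 0 denoting the empty prefix.

```python
def prefix_substring_query(s: str, i: int, j: int, k: int) -> int:
    """Length of the longest prefix of s that is a suffix of s[1:i] s[j:k].

    Indices are 1-based and inclusive, as in the problem statement.
    """
    u = s[:i] + s[j - 1:k]  # concatenation of the prefix and the substring
    for length in range(min(len(s), len(u)), -1, -1):
        if u.endswith(s[:length]):
            return length  # length 0 always matches, so the loop always returns
    return 0
```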

Cite as

Paweł Gawrychowski, Florin Manea, and Jonas Richardsen. Optimal Structure for Prefix-Substring Queries. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 7:1-7:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{gawrychowski_et_al:LIPIcs.CPM.2026.7,
  author =	{Gawrychowski, Pawe{\l} and Manea, Florin and Richardsen, Jonas},
  title =	{{Optimal Structure for Prefix-Substring Queries}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{7:1--7:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.7},
  URN =		{urn:nbn:de:0030-drops-259333},
  doi =		{10.4230/LIPIcs.CPM.2026.7},
  annote =	{Keywords: Border Tree, Prefix-Substring Query, Data Structures}
}

Document

DOI: 10.4230/LIPIcs.CPM.2026.18

On Time-Memory Tradeoffs for Maximal Palindromes with Wildcards and k-Mismatches

Authors: Amihood Amir, Ayelet Butman, Michael Itzhaki, and Dina Sokol

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

This paper addresses the problem of identifying palindromic factors in texts that include wildcards, special characters that match all others. These symbols challenge many classical algorithms, as numerous combinatorial properties are not satisfied in their presence. We apply existing wildcard-LCE techniques to obtain a continuous time-memory tradeoff, and present the first non-trivial linear-space algorithm for computing all maximal palindromes with wildcards, improving the best known time-memory product in certain parameter ranges. Our main results are algorithms to find and approximate all maximal palindromes in a given text. We also generalize both methods to the k-mismatches setting, with or without wildcards.

Cite as

Amihood Amir, Ayelet Butman, Michael Itzhaki, and Dina Sokol. On Time-Memory Tradeoffs for Maximal Palindromes with Wildcards and k-Mismatches. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 18:1-18:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{amir_et_al:LIPIcs.CPM.2026.18,
  author =	{Amir, Amihood and Butman, Ayelet and Itzhaki, Michael and Sokol, Dina},
  title =	{{On Time-Memory Tradeoffs for Maximal Palindromes with Wildcards and k-Mismatches}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{18:1--18:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.18},
  URN =		{urn:nbn:de:0030-drops-259444},
  doi =		{10.4230/LIPIcs.CPM.2026.18},
  annote =	{Keywords: Wildcards, Mismatches, Palindrome}
}

Document

DOI: 10.4230/LIPIcs.CPM.2026.20

Longest Common Extension of a Dynamic String in Parallel Constant Time

Authors: Daniel Alexander Albert

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

A longest common extension (LCE) query on a string computes the length of the longest common suffix or prefix at two given positions. A dynamic LCE algorithm maintains a data structure that allows efficient LCE queries on a string that can change via character insertions and deletions. A dynamic parallel constant-time algorithm is presented that can maintain LCE queries on a common CRCW PRAM with 𝒪(n^ε) work, for any ε > 0. The algorithm maintains a string synchronizing sets hierarchy, which it uses to answer substring equality queries, which it in turn uses to answer LCE queries. To achieve constant runtime, the algorithm allows parts of its information to become outdated by up to log n log^* n updates. It answers queries by combining this slightly outdated information with a list of the recent changes. Two applications of this dynamic LCE algorithm are shown. Firstly, a dynamic parallel constant-time algorithm can maintain membership in a Dyck language D_k, k > 0 with 𝒪(n^ε) work for any ε > 0. Secondly, a dynamic parallel constant-time algorithm can maintain squares with 𝒪(n^ε) work for any ε > 0.
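The reduction from LCE queries to substring equality queries can be illustrated sequentially with a binary search over the extension length. The sketch below assumes the longest-common-prefix variant of LCE on a static string and uses a naive equality test as a stand-in oracle; it does not model the PRAM algorithm or the synchronizing sets hierarchy, and both function names are hypothetical.

```python
def substrings_equal(s: str, i: int, j: int, length: int) -> bool:
    """Stand-in equality oracle: compares s[i:i+length] with s[j:j+length]."""
    return s[i:i + length] == s[j:j + length]


def lce_via_equality(s: str, i: int, j: int) -> int:
    """Length of the longest common prefix of the suffixes s[i:] and s[j:]."""
    lo, hi = 0, len(s) - max(i, j)
    # Equality is monotone in the length, so binary search finds the maximum.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if substrings_equal(s, i, j, mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```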

Cite as

Daniel Alexander Albert. Longest Common Extension of a Dynamic String in Parallel Constant Time. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 20:1-20:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{albert:LIPIcs.CPM.2026.20,
  author =	{Albert, Daniel Alexander},
  title =	{{Longest Common Extension of a Dynamic String in Parallel Constant Time}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{20:1--20:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.20},
  URN =		{urn:nbn:de:0030-drops-259467},
  doi =		{10.4230/LIPIcs.CPM.2026.20},
  annote =	{Keywords: Dynamic Strings, Work, Parallel Constant Time, Longest Common Extension, Longest Common Prefix}
}

Document

DOI: 10.4230/LIPIcs.CPM.2026.27

Exploring the Gap Between LCS and LCStr

Authors: Shay Golan, Matan Kraus, Ely Porat, and B. Riva Shalom

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

The Longest Common Subsequence (LCS) problem and the Longest Common Substring (LCStr) problem are classical string problems with broad theoretical and practical significance. The former has a quadratic conditional lower bound [FOCS, 2015], while the latter admits a linear-time solution. In this paper, we study a natural variation of these problems, the Longest Common Subsequence-Substring (LCSS) problem. The LCSS problem seeks the longest string that is simultaneously a subsequence of one input string and a substring of the other. This variant bridges LCS and LCStr, raising intriguing algorithmic questions: Does the complexity of computing LCSS interpolate between the linear time of LCStr and the quadratic time of LCS? What about approximability? We also examine a natural extension of LCSS to multiple strings, parameterizing the balance between subsequence and substring requirements. Our results reveal several insights. First, under the SETH conjecture, the inherent complexity of LCSS is quadratic, similar to LCS. In contrast, we provide a linear-time approximation for LCSS. Finally, for the multi-string variant, unlike both problems, we design a quadratic-time algorithm, uncovering deeper structural properties of the problem. By studying the complexity of the LCSS problem, we aim to gain some understanding of what influences whether a variant of the LCS problem behaves more like the standard LCS or like LCStr. Our findings suggest that hybrid constraints can create computational "sweet spots," where problems become more tractable than their pure counterparts. This opens a broader research direction in constraint-mediated algorithm design. Beyond LCSS itself, our work highlights unexpected connections between subsequence and substring constraints, advancing the theoretical understanding of string problems and laying the foundation for new algorithmic techniques and complexity-theoretic insights in the rich space between classical string comparison paradigms.
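The LCSS problem admits a simple quadratic-time dynamic program, which matches the conditional lower bound stated above up to lower-order factors. The sketch below is one straightforward formulation and is not claimed to be the authors' algorithm; the function name is hypothetical. It keeps, for each prefix of a and each end position j in b, the length of the longest substring of b ending at j that is a subsequence of that prefix of a.

```python
def lcss_length(a: str, b: str) -> int:
    """Length of the longest string that is a subsequence of a and a substring of b."""
    # prev[j]: longest substring of b ending at position j (exclusive) that is a
    # subsequence of the prefix of a processed so far.
    prev = [0] * (len(b) + 1)
    best = 0
    for ch in a:
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if ch == b[j - 1]:
                # Match the last character of the substring with ch.
                cur[j] = prev[j - 1] + 1
            else:
                # ch cannot be the last matched character; ignore it.
                cur[j] = prev[j]
        best = max(best, max(cur))
        prev = cur
    return best
```

Note that the problem is asymmetric: swapping the two arguments can change the answer.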

Cite as

Shay Golan, Matan Kraus, Ely Porat, and B. Riva Shalom. Exploring the Gap Between LCS and LCStr. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 27:1-27:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{golan_et_al:LIPIcs.CPM.2026.27,
  author =	{Golan, Shay and Kraus, Matan and Porat, Ely and Shalom, B. Riva},
  title =	{{Exploring the Gap Between LCS and LCStr}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{27:1--27:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.27},
  URN =		{urn:nbn:de:0030-drops-259535},
  doi =		{10.4230/LIPIcs.CPM.2026.27},
  annote =	{Keywords: Longest Common Subsequence, Longest Common Substring, Conditional Lower Bound}
}

Document

DOI: 10.4230/LIPIcs.CPM.2026.32

Balancing Two-Dimensional Straight-Line Programs

Authors: Itai Boneh, Estéban Gabory, Paweł Gawrychowski, and Adam Górkiewicz

Published in: LIPIcs, Volume 369, 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Abstract

We consider building, given a straight-line program (SLP) consisting of g productions deriving a two-dimensional string T of size N × N, a structure capable of providing random access to any character of T. For one-dimensional strings, it is now known how to build a structure of size 𝒪(g) that provides random access in 𝒪(log N) time. In fact, it is known that this can be obtained by building an equivalent SLP of size 𝒪(g) and depth 𝒪(log N) [Ganardi, Jeż, Lohrey, JACM 2021]. We consider the analogous question for two-dimensional strings: can we build an equivalent SLP of roughly the same size and small depth? We show that the answer is negative: there exists an infinite family of two-dimensional strings of size N × N described by a 2D SLP of size g such that any 2D SLP of depth 𝒪(log N) describing the same string must be of size Ω(g ⋅ N/log³ N). We complement this with an upper bound showing how to construct such a 2D SLP of size 𝒪(g ⋅ N). Next, we observe that one can naturally define a generalization of 2D SLP, which we call 2D SLP with holes. We show that a known general balancing theorem by [Ganardi, Jeż, Lohrey, JACM 2021] immediately implies that, given a 2D SLP of size g deriving a string of size N × N, we can construct a 2D SLP with holes of depth 𝒪(log N) and size 𝒪(g). This allows us to conclude that there is a structure of size 𝒪(g) providing random access in 𝒪(log N) time for such a 2D SLP. Further, this can be extended (as for a 1D SLP) to obtain a structure of size 𝒪(g log^ε N) providing random access in 𝒪(log N/log log N) time, for any ε > 0. The same (optimal) random access time was very recently achieved by [De and Kempa, SODA 2026], but with a significantly larger structure of size 𝒪(g log^{2+ε} N).

Cite as

Itai Boneh, Estéban Gabory, Paweł Gawrychowski, and Adam Górkiewicz. Balancing Two-Dimensional Straight-Line Programs. In 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 369, pp. 32:1-32:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{boneh_et_al:LIPIcs.CPM.2026.32,
  author =	{Boneh, Itai and Gabory, Est\'{e}ban and Gawrychowski, Pawe{\l} and G\'{o}rkiewicz, Adam},
  title =	{{Balancing Two-Dimensional Straight-Line Programs}},
  booktitle =	{37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
  pages =	{32:1--32:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-420-8},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{369},
  editor =	{Bille, Philip and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.32},
  URN =		{urn:nbn:de:0030-drops-259582},
  doi =		{10.4230/LIPIcs.CPM.2026.32},
  annote =	{Keywords: Two-dimensional string, straight-line program, random access}
}

Document

DOI: 10.4230/LIPIcs.STACS.2026.26

Approximate Cartesian Tree Matching with Substitutions

Authors: Panagiotis Charalampopoulos, Jonas Ellert, and Manal Mohamed

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Abstract

The Cartesian tree of a sequence captures the relative order of the sequence’s elements. In recent years, Cartesian tree matching has attracted considerable attention, particularly due to its applications in time series analysis. Consider a text T of length n and a pattern P of length m. In the exact Cartesian tree matching problem, the task is to find all length-m fragments of T whose Cartesian tree coincides with the Cartesian tree CT(P) of the pattern. Although the exact version of the problem can be solved in linear time [Park et al., TCS 2020], it remains rather restrictive; for example, it is not robust to outliers in the pattern. To overcome this limitation, we consider the approximate setting, where the goal is to identify all fragments of T that are close to some string whose Cartesian tree matches CT(P). In this work, we quantify closeness via the widely used Hamming distance metric. For a given integer parameter k > 0, we present an algorithm that computes all fragments of T that are at Hamming distance at most k from a string whose Cartesian tree matches CT(P). Our algorithm runs in time 𝒪(n √m ⋅ k^{2.5}) for k ≤ m^{1/5} and in time 𝒪(nk⁵) for k ≥ m^{1/5}, thereby improving upon the state-of-the-art 𝒪(nmk)-time algorithm of Kim and Han [TCS 2025] in the regime k = o(m^{1/4}). On the way to our solution, we develop a toolbox of independent interest. First, we introduce a new notion of periodicity in Cartesian trees. Then, we lift multiple well-known combinatorial and algorithmic results for string matching and periodicity in strings to Cartesian tree matching and periodicity in Cartesian trees.
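Exact Cartesian tree matching can be illustrated by building the tree of every length-m fragment and comparing shapes. The sketch below is a naive 𝒪(nm)-time baseline, not the approximate algorithm of the paper; the function names are hypothetical, and it assumes the usual min-rooted Cartesian tree with ties broken in favor of the leftmost minimum.

```python
def cartesian_tree_parents(seq: list[int]) -> list[int]:
    """Parent index of each position in the min-rooted Cartesian tree (-1 for the root)."""
    parent = [-1] * len(seq)
    stack = []  # indices on the rightmost path, with non-decreasing values
    for i, x in enumerate(seq):
        last = -1
        # Strict comparison keeps earlier equal elements higher in the tree.
        while stack and seq[stack[-1]] > x:
            last = stack.pop()
        if last != -1:
            parent[last] = i  # the popped subtree becomes the left child of i
        if stack:
            parent[i] = stack[-1]  # i becomes the right child of the stack top
        stack.append(i)
    return parent


def ct_match_occurrences(text: list[int], pattern: list[int]) -> list[int]:
    """Start positions of fragments of text whose Cartesian tree equals that of pattern."""
    m = len(pattern)
    target = cartesian_tree_parents(pattern)
    return [i for i in range(len(text) - m + 1)
            if cartesian_tree_parents(text[i:i + m]) == target]
```

Two sequences of the same length have the same Cartesian tree exactly when their parent arrays coincide, which is what the comparison in `ct_match_occurrences` relies on.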

Cite as

Panagiotis Charalampopoulos, Jonas Ellert, and Manal Mohamed. Approximate Cartesian Tree Matching with Substitutions. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 26:1-26:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{charalampopoulos_et_al:LIPIcs.STACS.2026.26,
  author =	{Charalampopoulos, Panagiotis and Ellert, Jonas and Mohamed, Manal},
  title =	{{Approximate Cartesian Tree Matching with Substitutions}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{26:1--26:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.26},
  URN =		{urn:nbn:de:0030-drops-255151},
  doi =		{10.4230/LIPIcs.STACS.2026.26},
  annote =	{Keywords: Cartesian tree, Hamming distance, approximate pattern matching}
}

Document

DOI: 10.4230/LIPIcs.STACS.2026.62

Relative Compressed Reverse Suffix Array

Authors: Muhammed Oguzhan Kulekci, Mano Prakash Parthasarathi, Rahul Shah, and Sharma V. Thankachan

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Abstract

Suffix trees and suffix arrays are two fundamental data structures in the field of string algorithms. For a string (a.k.a. text or sequence) of length n over an alphabet of size σ, these structures typically require O(n log n) bits of space. The FM-index provides a compressed representation of the suffix array in ≈ n log σ bits, allowing for efficient queries on both the suffix array and its inverse array in near-logarithmic time. In certain applications, such as approximate pattern matching (i.e., with wildcards, mismatches, edits), there is a need to access the suffix array of a text, as well as the suffix array of the text’s reverse. Motivated by this, we explore the possibility of encoding the suffix array of the reversed text in a compact form, assuming the availability of the FM-index for the original text. Our first solution is an O(n)-bit (relative) encoding of the suffix array of the reversed text, with the time for decoding an entry being only O(log^* n) times that of decoding an entry in the text’s suffix array using the FM-index. We then demonstrate how to reduce the space to O(n/κ) bits for a parameter κ, while the multiplicative factor in time becomes approximately O(κ log^* n + κ³). We can also support inverse suffix array and longest common extension queries on the reversed text. These results are achieved through careful and non-trivial application of various succinct data structure techniques.
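As a small illustration of the two objects the abstract relates, the sketch below builds the suffix array of a text and of its reverse by plain sorting. It is a naive, uncompressed construction written only to show the arrays side by side. The helper name `suffix_array` is hypothetical, and nothing here reflects the relative encoding or the FM-index machinery of the paper.

```python
def suffix_array(s):
    """Start positions of the suffixes of s in lexicographic order (naive sorting)."""
    return sorted(range(len(s)), key=lambda i: s[i:])


if __name__ == "__main__":
    text = "banana"
    print(suffix_array(text))        # suffix array of the text
    print(suffix_array(text[::-1]))  # suffix array of the reversed text
```

Storing both arrays explicitly costs O(n log n) bits each, which is the overhead a compact relative encoding aims to avoid.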

Cite as

Muhammed Oguzhan Kulekci, Mano Prakash Parthasarathi, Rahul Shah, and Sharma V. Thankachan. Relative Compressed Reverse Suffix Array. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 62:1-62:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{kulekci_et_al:LIPIcs.STACS.2026.62,
  author =	{Kulekci, Muhammed Oguzhan and Parthasarathi, Mano Prakash and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Relative Compressed Reverse Suffix Array}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{62:1--62:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.62},
  URN =		{urn:nbn:de:0030-drops-255512},
  doi =		{10.4230/LIPIcs.STACS.2026.62},
  annote =	{Keywords: String Matching, Text Indexing, Data Structures, Suffix Trees}
}

Document

DOI: 10.4230/LIPIcs.STACS.2026.68

Dynamic Pattern Matching with Wildcards

Authors: Arshia Ataee Naeini, Amir-Parsa Mobed, Masoud Seddighin, and Saeed Seddighin

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Abstract

We study the fully dynamic pattern matching problem where the pattern may contain up to k wildcard symbols, each matching any symbol of the alphabet. Both the text and the pattern are subject to updates (insert, delete, change). We design an algorithm with 𝒪(n log² n) preprocessing and update/query time 𝒪̃(kn^{k/(k+1)} + k² log n). The bound is truly sublinear for a constant k, and sublinear when k = o(log n). We further complement our results with a conditional lower bound: assuming subquadratic preprocessing time, achieving truly sublinear update time for the case k = Ω(log n) would contradict the Strong Exponential Time Hypothesis (SETH). Finally, we develop sublinear algorithms for two special cases:
- If the pattern contains w non-wildcard symbols, we give an algorithm with preprocessing time 𝒪(nw) and update time 𝒪(w + log n), which is truly sublinear whenever w is truly sublinear.
- Using the FFT technique combined with block decomposition, we design a deterministic truly sublinear algorithm with preprocessing time 𝒪(n^{1.8}) and update time 𝒪(n^{0.8} log n) for the case that there are at most two non-wildcards.
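For readers unfamiliar with how FFT enters wildcard matching, the sketch below shows the classical static identity only; it is not the dynamic algorithm of the paper. It assumes wildcards occur only in the pattern and are written as `?`, and the function name `wildcard_matches` is hypothetical. Each non-wildcard symbol is mapped to a nonzero integer and each wildcard to 0, so that an alignment matches exactly when the sum of p_j (p_j - t_{i+j})² over j is zero. Expanding the square gives three correlations that an FFT could evaluate for all alignments at once; here the sums are computed directly for clarity.

```python
def wildcard_matches(text, pattern, wildcard="?"):
    """Alignments i where pattern matches text[i:i+m], with `wildcard` matching any symbol."""
    n, m = len(text), len(pattern)
    t = [ord(c) for c in text]
    # Wildcards become 0 so their term vanishes regardless of the text symbol.
    p = [0 if c == wildcard else ord(c) for c in pattern]
    hits = []
    for i in range(n - m + 1):
        # Every term is non-negative, so the sum is 0 only if each term is 0.
        score = sum(p[j] * (p[j] - t[i + j]) ** 2 for j in range(m))
        if score == 0:
            hits.append(i)
    return hits


if __name__ == "__main__":
    print(wildcard_matches("abcabd", "ab?"))
```

The direct evaluation takes O(nm) time; replacing the inner sum with FFT-based correlations is what brings the static problem down to O(n log m).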

Cite as

Arshia Ataee Naeini, Amir-Parsa Mobed, Masoud Seddighin, and Saeed Seddighin. Dynamic Pattern Matching with Wildcards. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 68:1-68:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

@InProceedings{naeini_et_al:LIPIcs.STACS.2026.68,
  author =	{Naeini, Arshia Ataee and Mobed, Amir-Parsa and Seddighin, Masoud and Seddighin, Saeed},
  title =	{{Dynamic Pattern Matching with Wildcards}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{68:1--68:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.68},
  URN =		{urn:nbn:de:0030-drops-255579},
  doi =		{10.4230/LIPIcs.STACS.2026.68},
  annote =	{Keywords: pattern matching, wildcards, dynamic algorithms, string algorithms, data structures}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2025.9

Small Space Encoding and Recognition of k-Palindromic Prefixes

Authors: Gabriel Bathie, Jonas Ellert, and Tatiana Starikovskaya

Published in: LIPIcs, Volume 359, 36th International Symposium on Algorithms and Computation (ISAAC 2025)

Abstract

Palindromes are non-empty strings that read the same forward and backward. We study the problem of recognizing so-called k-palindromic strings, which can be represented as the concatenation of exactly k palindromes. [Rubinchik and Shur, MFCS 2020] showed that the problem is solvable in linear space and time. We present a read-only algorithm that recognizes all k-palindromic prefixes of a string T of length n in O(n ⋅ 6^{k²} ⋅ log^k n) time and O(6^{k²} ⋅ log^k n) space. As a corollary, we also obtain a read-only algorithm for computing the palindromic length of T, i.e., the smallest k such that T is k-palindromic, in O(n ⋅ 6^{k²} ⋅ log^⌈k/2⌉ n) time and O(6^{k²} ⋅ log^⌈k/2⌉ n) space.
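The definition of palindromic length can be checked on small inputs with a straightforward dynamic program. The sketch below is a quadratic-time, linear-space baseline for illustration only; it is not the read-only small-space algorithm of the paper, and the function name `palindromic_length` is hypothetical.

```python
def palindromic_length(s):
    """Smallest k such that s is a concatenation of exactly k non-empty palindromes."""
    n = len(s)
    # is_pal[j][i] is True when s[j:i+1] is a palindrome.
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, -1, -1):
            if s[j] == s[i] and (i - j < 2 or is_pal[j + 1][i - 1]):
                is_pal[j][i] = True
    # best[i] is the palindromic length of the prefix s[:i]; best[0] = 0 for the empty prefix.
    best = [0] + [n + 1] * n
    for i in range(1, n + 1):
        for j in range(i):
            if is_pal[j][i - 1]:
                best[i] = min(best[i], best[j] + 1)
    return best[n]


if __name__ == "__main__":
    for word in ["aba", "abaab", "abc"]:
        print(word, palindromic_length(word))
```

For instance, "abaab" factors as "a" followed by "baab", so its palindromic length is 2.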

Cite as

Gabriel Bathie, Jonas Ellert, and Tatiana Starikovskaya. Small Space Encoding and Recognition of k-Palindromic Prefixes. In 36th International Symposium on Algorithms and Computation (ISAAC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 359, pp. 9:1-9:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

@InProceedings{bathie_et_al:LIPIcs.ISAAC.2025.9,
  author =	{Bathie, Gabriel and Ellert, Jonas and Starikovskaya, Tatiana},
  title =	{{Small Space Encoding and Recognition of k-Palindromic Prefixes}},
  booktitle =	{36th International Symposium on Algorithms and Computation (ISAAC 2025)},
  pages =	{9:1--9:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-408-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{359},
  editor =	{Chen, Ho-Lin and Hon, Wing-Kai and Tsai, Meng-Tsung},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2025.9},
  URN =		{urn:nbn:de:0030-drops-249178},
  doi =		{10.4230/LIPIcs.ISAAC.2025.9},
  annote =	{Keywords: palindromic length, read-only algorithms, palindromes}
}

59 Search Results for "Amir, Amihood"
