Document

**Published in:** LIPIcs, Volume 289, 41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024)

In Gapped String Indexing, the goal is to compactly represent a string S of length n such that for any query consisting of two strings P₁ and P₂, called patterns, and an integer interval [α, β], called gap range, we can quickly find occurrences of P₁ and P₂ in S with distance in [α, β]. Gapped String Indexing is a central problem in computational biology and text mining and has thus received significant research interest, including parameterized and heuristic approaches. Despite this interest, the best-known time-space trade-offs for Gapped String Indexing are the straightforward 𝒪(n) space and 𝒪(n+ occ) query time or Ω(n²) space and Õ(|P₁| + |P₂| + occ) query time.
We break through this barrier obtaining the first interesting trade-offs with polynomially subquadratic space and polynomially sublinear query time. In particular, we show that, for every 0 ≤ δ ≤ 1, there is a data structure for Gapped String Indexing with either Õ(n^{2-δ/3}) or Õ(n^{3-2δ}) space and Õ(|P₁| + |P₂| + n^{δ}⋅ (occ+1)) query time, where occ is the number of reported occurrences.
As a new fundamental tool towards obtaining our main result, we introduce the Shifted Set Intersection problem: preprocess a collection of sets S₁, …, S_k of integers such that for any query consisting of three integers i,j,s, we can quickly output YES if and only if there exist a ∈ S_i and b ∈ S_j with a+s = b. We start by showing that the Shifted Set Intersection problem is equivalent to the indexing variant of 3SUM (3SUM Indexing) [Golovnev et al., STOC 2020]. We then give a data structure for Shifted Set Intersection with gaps, which entails a solution to the Gapped String Indexing problem. Furthermore, we enhance our data structure for deciding Shifted Set Intersection, so that we can support the reporting variant of the problem, i.e., outputting all certificates in the affirmative case. Via the obtained equivalence to 3SUM Indexing, we thus give new improved data structures for the reporting variant of 3SUM Indexing, and we show how this improves upon the state-of-the-art solution for Jumbled Indexing [Chan and Lewenstein, STOC 2015] for any alphabet of constant size σ > 5.

Philip Bille, Inge Li Gørtz, Moshe Lewenstein, Solon P. Pissis, Eva Rotenberg, and Teresa Anna Steiner. Gapped String Indexing in Subquadratic Space and Sublinear Query Time. In 41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 289, pp. 16:1-16:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{bille_et_al:LIPIcs.STACS.2024.16, author = {Bille, Philip and G{\o}rtz, Inge Li and Lewenstein, Moshe and Pissis, Solon P. and Rotenberg, Eva and Steiner, Teresa Anna}, title = {{Gapped String Indexing in Subquadratic Space and Sublinear Query Time}}, booktitle = {41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024)}, pages = {16:1--16:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-311-9}, ISSN = {1868-8969}, year = {2024}, volume = {289}, editor = {Beyersdorff, Olaf and Kant\'{e}, Mamadou Moustapha and Kupferman, Orna and Lokshtanov, Daniel}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2024.16}, URN = {urn:nbn:de:0030-drops-197262}, doi = {10.4230/LIPIcs.STACS.2024.16}, annote = {Keywords: data structures, string indexing, indexing with gaps, two patterns} }

Document

**Published in:** LIPIcs, Volume 259, 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)

A factorization of a string S is a partition of w into substrings u_1,… ,u_k such that S = u_1 u_2 ⋯ u_k. Such a partition is called equality-free if no two factors are equal: u_i ≠ u_j, ∀ i,j with i ≠ j. The maximum equality-free factorization problem is to find for a given string S, the largest integer k for which S admits an equality-free factorization with k factors.
Equality-free factorizations have lately received attention because of their applications in DNA self-assembly. The best approximation algorithm known for the problem is the natural greedy algorithm, that chooses iteratively from left to right the shortest factor that does not appear before. This algorithm has a √n approximation ratio (SOFSEM 2020) and it is an open problem whether there is a better solution.
Our main result is to show that the natural greedy algorithm is a Θ(n^{1/4}) approximation algorithm for the maximum equality-free factorization problem. Thus, we disprove one of the conjectures of Mincu and Popa (SOFSEM 2020) according to which the greedy algorithm is a Θ(√n) approximation.
The most challenging part of the proof is to show that the greedy algorithm is an O(n^{1/4}) approximation. We obtain this algorithm via prefix free factor families, i.e. a set of non-overlapping factors of the string which are pairwise non-prefixes of each other. In the paper we show the relation between prefix free factor families and the maximum equality-free factorization. Moreover, as a byproduct we present another approximation algorithm that achieves an approximation ratio of O(n^{1/4}) that we believe is of independent interest and may lead to improved algorithms. We then show that the natural greedy algorithm has an approximation ratio that is Ω(n^{1/4}) via a clever analysis which shows that the greedy algorithm is Θ(n^{1/4}) for the maximum equality-free factorization problem.

Matan Kraus, Moshe Lewenstein, Alexandru Popa, Ely Porat, and Yonathan Sadia. String Factorization via Prefix Free Families. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 19:1-19:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{kraus_et_al:LIPIcs.CPM.2023.19, author = {Kraus, Matan and Lewenstein, Moshe and Popa, Alexandru and Porat, Ely and Sadia, Yonathan}, title = {{String Factorization via Prefix Free Families}}, booktitle = {34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)}, pages = {19:1--19:10}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-276-1}, ISSN = {1868-8969}, year = {2023}, volume = {259}, editor = {Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2023.19}, URN = {urn:nbn:de:0030-drops-179738}, doi = {10.4230/LIPIcs.CPM.2023.19}, annote = {Keywords: string factorization, NP-hard problem, approximation algorithm} }

Document

**Published in:** LIPIcs, Volume 149, 30th International Symposium on Algorithms and Computation (ISAAC 2019)

In the SetDisjointness problem, a collection of m sets S_1,S_2,...,S_m from some universe U is preprocessed in order to answer queries on the emptiness of the intersection of some two query sets from the collection. In the SetIntersection variant, all the elements in the intersection of the query sets are required to be reported. These are two fundamental problems that were considered in several papers from both the upper bound and lower bound perspective.
Several conditional lower bounds for these problems were proven for the tradeoff between preprocessing and query time or the tradeoff between space and query time. Moreover, there are several unconditional hardness results for these problems in some specific computational models. The fundamental nature of the SetDisjointness and SetIntersection problems makes them useful for proving the conditional hardness of other problems from various areas. However, the universe of the elements in the sets may be very large, which may cause the reduction to some other problems to be inefficient and therefore it is not useful for proving their conditional hardness.
In this paper, we prove the conditional hardness of SetDisjointness and SetIntersection with bounded universe. This conditional hardness is shown for both the interplay between preprocessing and query time and the interplay between space and query time. Moreover, we present several applications of these new conditional lower bounds. These applications demonstrates the strength of our new conditional lower bounds as they exploit the limited universe size. We believe that this new framework of conditional lower bounds with bounded universe can be useful for further significant applications.

Isaac Goldstein, Moshe Lewenstein, and Ely Porat. On the Hardness of Set Disjointness and Set Intersection with Bounded Universe. In 30th International Symposium on Algorithms and Computation (ISAAC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 149, pp. 7:1-7:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{goldstein_et_al:LIPIcs.ISAAC.2019.7, author = {Goldstein, Isaac and Lewenstein, Moshe and Porat, Ely}, title = {{On the Hardness of Set Disjointness and Set Intersection with Bounded Universe}}, booktitle = {30th International Symposium on Algorithms and Computation (ISAAC 2019)}, pages = {7:1--7:22}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-130-6}, ISSN = {1868-8969}, year = {2019}, volume = {149}, editor = {Lu, Pinyan and Zhang, Guochuan}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2019.7}, URN = {urn:nbn:de:0030-drops-115036}, doi = {10.4230/LIPIcs.ISAAC.2019.7}, annote = {Keywords: set disjointness, set intersection, 3SUM, space-time tradeoff, conditional lower bounds} }

Document

**Published in:** LIPIcs, Volume 112, 26th Annual European Symposium on Algorithms (ESA 2018)

In the kSUM problem we are given an array of numbers a_1,a_2,...,a_n and we are required to determine if there are k different elements in this array such that their sum is 0. This problem is a parameterized version of the well-studied SUBSET-SUM problem, and a special case is the 3SUM problem that is extensively used for proving conditional hardness. Several works investigated the interplay between time and space in the context of SUBSET-SUM. Recently, improved time-space tradeoffs were proven for kSUM using both randomized and deterministic algorithms.
In this paper we obtain an improvement over the best known results for the time-space tradeoff for kSUM. A major ingredient in achieving these results is a general self-reduction from kSUM to mSUM where m<k, and several useful observations that enable this reduction and its implications. The main results we prove in this paper include the following: (i) The best known Las Vegas solution to kSUM running in approximately O(n^{k-delta sqrt{2k}}) time and using O(n^{delta}) space, for 0 <= delta <= 1. (ii) The best known deterministic solution to kSUM running in approximately O(n^{k-delta sqrt{k}}) time and using O(n^{delta}) space, for 0 <= delta <= 1. (iii) A space-time tradeoff for solving kSUM using O(n^{delta}) space, for delta>1. (iv) An algorithm for 6SUM running in O(n^4) time using just O(n^{2/3}) space. (v) A solution to 3SUM on random input using O(n^2) time and O(n^{1/3}) space, under the assumption of a random read-only access to random bits.

Isaac Goldstein, Moshe Lewenstein, and Ely Porat. Improved Space-Time Tradeoffs for kSUM. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 37:1-37:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{goldstein_et_al:LIPIcs.ESA.2018.37, author = {Goldstein, Isaac and Lewenstein, Moshe and Porat, Ely}, title = {{Improved Space-Time Tradeoffs for kSUM}}, booktitle = {26th Annual European Symposium on Algorithms (ESA 2018)}, pages = {37:1--37:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-081-1}, ISSN = {1868-8969}, year = {2018}, volume = {112}, editor = {Azar, Yossi and Bast, Hannah and Herman, Grzegorz}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2018.37}, URN = {urn:nbn:de:0030-drops-95000}, doi = {10.4230/LIPIcs.ESA.2018.37}, annote = {Keywords: kSUM, space-time tradeoff, self-reduction} }

Document

**Published in:** LIPIcs, Volume 92, 28th International Symposium on Algorithms and Computation (ISAAC 2017)

In the recent years, intensive research work has been dedicated to prove conditional lower bounds in order to reveal the inner structure of the class P. These conditional lower bounds are based on many popular conjectures on well-studied problems. One of the most heavily used conjectures is the celebrated Strong Exponential Time Hypothesis (SETH). It turns out that conditional hardness proved based on SETH goes, in many cases, through an intermediate problem - the Orthogonal Vectors (OV) problem.
Almost all research work regarding conditional lower bound was concentrated on time complexity. Very little attention was directed toward space complexity. In a recent work, Goldstein et al.[WADS '17] set the stage for proving conditional lower bounds regarding space and its interplay with time. In this spirit, it is tempting to investigate the space complexity of a data structure variant of OV which is called OV indexing. In this problem n boolean vectors of size clogn are given for preprocessing. As a query, a vector v is given and we are required to verify if there is an input vector that is orthogonal to it or not.
This OV indexing problem is interesting in its own, but it also likely to have strong implications on problems known to be conditionally hard, in terms of time complexity, based on OV. Having this in mind, we study OV indexing in this paper from many aspects. We give some space-efficient algorithms for the problem, show a tradeoff between space and query time, describe how to solve its reporting variant, shed light on an interesting connection between this problem and the well-studied SetDisjointness problem and demonstrate how it can be solved more efficiently on random input.

Isaac Goldstein, Moshe Lewenstein, and Ely Porat. Orthogonal Vectors Indexing. In 28th International Symposium on Algorithms and Computation (ISAAC 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 92, pp. 40:1-40:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{goldstein_et_al:LIPIcs.ISAAC.2017.40, author = {Goldstein, Isaac and Lewenstein, Moshe and Porat, Ely}, title = {{Orthogonal Vectors Indexing}}, booktitle = {28th International Symposium on Algorithms and Computation (ISAAC 2017)}, pages = {40:1--40:12}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-054-5}, ISSN = {1868-8969}, year = {2017}, volume = {92}, editor = {Okamoto, Yoshio and Tokuyama, Takeshi}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2017.40}, URN = {urn:nbn:de:0030-drops-82395}, doi = {10.4230/LIPIcs.ISAAC.2017.40}, annote = {Keywords: SETH, orthogonal vectors, space complexity} }

Document

**Published in:** LIPIcs, Volume 78, 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)

Data analysis typically involves error recovery and detection of regularities as two different key tasks. In this paper we show that there are data types for which these two tasks can be powerfully combined. A common notion of regularity in strings is that of a cover. Data describing measures of a natural coverable phenomenon may be corrupted by errors caused by the measurement process, or by the inexact features of the phenomenon itself. Due to this reason, different variants of approximate covers have been introduced, some of which are NP-hard to compute. In this paper we assume that the Hamming distance metric measures the amount of corruption experienced, and study the problem of recovering the correct cover from data corrupted by mismatch errors, formally defined as the cover recovery problem (CRP). We show that for the Hamming distance metric, coverability is a powerful property allowing detecting the original cover and correcting the data, under suitable conditions.
We also study a relaxation of another problem, which is called the approximate cover problem (ACP). Since the ACP is proved to be NP-hard [Amir,Levy,Lubin,Porat, CPM 2017], we study a relaxation, which we call the candidate-relaxation of the ACP, and show it has a polynomial time complexity. As a result, we get that the ACP also has a polynomial time complexity in many practical situations. An important application of our ACP relaxation study is also a polynomial time algorithm for the cover recovery problem (CRP).

Amihood Amir, Avivit Levy, Moshe Lewenstein, Ronit Lubin, and Benny Porat. Can We Recover the Cover?. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 25:1-25:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{amir_et_al:LIPIcs.CPM.2017.25, author = {Amir, Amihood and Levy, Avivit and Lewenstein, Moshe and Lubin, Ronit and Porat, Benny}, title = {{Can We Recover the Cover?}}, booktitle = {28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)}, pages = {25:1--25:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-039-2}, ISSN = {1868-8969}, year = {2017}, volume = {78}, editor = {K\"{a}rkk\"{a}inen, Juha and Radoszewski, Jakub and Rytter, Wojciech}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2017.25}, URN = {urn:nbn:de:0030-drops-73190}, doi = {10.4230/LIPIcs.CPM.2017.25}, annote = {Keywords: periodicity, quasi-periodicity, cover, approximate cover, data recovery} }

Document

**Published in:** Dagstuhl Reports, Volume 6, Issue 11 (2017)

This document contains description of the talks at the Dagstuhl seminar 16451 "Structure and Hardness in P". The main goal of the seminar was to bring together researchers from several disciplines and connect those who work on proving conditional lower bounds with those who or may benefit from it. This resulted in an extensive list of open problems which is also provided.

Moshe Lewenstein, Seth Pettie, and Virginia Vassilevska Williams. Structure and Hardness in P (Dagstuhl Seminar 16451). In Dagstuhl Reports, Volume 6, Issue 11, pp. 1-34, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@Article{lewenstein_et_al:DagRep.6.11.1, author = {Lewenstein, Moshe and Pettie, Seth and Vassilevska Williams, Virginia}, title = {{Structure and Hardness in P (Dagstuhl Seminar 16451)}}, pages = {1--34}, journal = {Dagstuhl Reports}, ISSN = {2192-5283}, year = {2017}, volume = {6}, number = {11}, editor = {Lewenstein, Moshe and Pettie, Seth and Vassilevska Williams, Virginia}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/DagRep.6.11.1}, URN = {urn:nbn:de:0030-drops-70373}, doi = {10.4230/DagRep.6.11.1}, annote = {Keywords: Algorithmic equivalences, Classifying P, Hardness assumptions, Lower bounds} }

Document

**Published in:** LIPIcs, Volume 57, 24th Annual European Symposium on Algorithms (ESA 2016)

In recent years much effort has been put into developing polynomial-time conditional lower bounds for algorithms and data structures in both static and dynamic settings. Along these lines we introduce a framework for proving conditional lower bounds based on the well-known 3SUM conjecture. Our framework creates a compact representation of an instance of the 3SUM problem using hashing and domain specific encoding. This compact representation admits false solutions to the original 3SUM problem instance which we reveal and eliminate until we find a true solution. In other words, from all witnesses (candidate solutions) we figure out if an honest one (a true solution) exists. This enumeration of witnesses is used to prove conditional lower bounds on reporting problems that generate all witnesses. In turn, these reporting problems are then reduced to various decision problems using special search data structures which are able to enumerate the witnesses while only using solutions to decision variants. Hence, 3SUM-hardness of the decision problems is deduced.
We utilize this framework to show conditional lower bounds for several variants of convolutions, matrix multiplication and string problems. Our framework uses a strong connection between all of these problems and the ability to find witnesses.
Specifically, we prove conditional lower bounds for computing partial outputs of convolutions and matrix multiplication for sparse inputs. These problems are inspired by the open question raised by Muthukrishnan 20 years ago. The lower bounds we show rule out the possibility (unless the 3SUM conjecture is false) that almost linear time solutions to sparse input-output convolutions or matrix multiplications exist. This is in contrast to standard convolutions and matrix multiplications that have, or assumed to have, almost linear solutions.
Moreover, we improve upon the conditional lower bounds of Amir et al. for histogram indexing, a problem that has been of much interest recently. The conditional lower bounds we show apply for both reporting and decision variants. For the well-studied decision variant, we show a full tradeoff between preprocessing and query time for every alphabet size > 2. At an extreme, this implies that no solution to this problem exists with subquadratic preprocessing time and ~O(1) query time for every alphabet size > 2, unless the 3SUM conjecture is false. This is in contrast to a recent result by Chan and Lewenstein for a binary alphabet.
While these specific applications are used to demonstrate the techniques of our framework, we believe that this novel framework is useful for many other problems as well.

Isaac Goldstein, Tsvi Kopelowitz, Moshe Lewenstein, and Ely Porat. How Hard is it to Find (Honest) Witnesses?. In 24th Annual European Symposium on Algorithms (ESA 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 57, pp. 45:1-45:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{goldstein_et_al:LIPIcs.ESA.2016.45, author = {Goldstein, Isaac and Kopelowitz, Tsvi and Lewenstein, Moshe and Porat, Ely}, title = {{How Hard is it to Find (Honest) Witnesses?}}, booktitle = {24th Annual European Symposium on Algorithms (ESA 2016)}, pages = {45:1--45:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-015-6}, ISSN = {1868-8969}, year = {2016}, volume = {57}, editor = {Sankowski, Piotr and Zaroliagis, Christos}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2016.45}, URN = {urn:nbn:de:0030-drops-63575}, doi = {10.4230/LIPIcs.ESA.2016.45}, annote = {Keywords: 3SUM, convolutions, matrix multiplication, histogram indexing} }

Document

Complete Volume

**Published in:** LIPIcs, Volume 54, 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)

LIPIcs, Volume 54, CPM'16, Complete Volume

27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@Proceedings{grossi_et_al:LIPIcs.CPM.2016, title = {{LIPIcs, Volume 54, CPM'16, Complete Volume}}, booktitle = {27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-012-5}, ISSN = {1868-8969}, year = {2016}, volume = {54}, editor = {Grossi, Roberto and Lewenstein, Moshe}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2016}, URN = {urn:nbn:de:0030-drops-60935}, doi = {10.4230/LIPIcs.CPM.2016}, annote = {Keywords: Data Structures, Data Storage Representations, Coding and Information Theory, Theory of Computation Discrete Mathematics, Information Systems} }

Document

Front Matter

**Published in:** LIPIcs, Volume 54, 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)

Front Matter, Table of Contents, Preface, List of Authors

27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 0:i-0:x, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{grossi_et_al:LIPIcs.CPM.2016.0, author = {Grossi, Roberto and Lewenstein, Moshe}, title = {{Front Matter, Table of Contents, Preface}}, booktitle = {27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)}, pages = {0:i--0:x}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-012-5}, ISSN = {1868-8969}, year = {2016}, volume = {54}, editor = {Grossi, Roberto and Lewenstein, Moshe}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2016.0}, URN = {urn:nbn:de:0030-drops-60916}, doi = {10.4230/LIPIcs.CPM.2016.0}, annote = {Keywords: Front Matter, Table of Contents, Preface, List of Authors} }

Document

**Published in:** LIPIcs, Volume 25, 31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014)

In this paper we describe compressed indexes that support pattern matching queries for strings with wildcards. For a constant size alphabet our data structure uses O(n.log^e(n)) bits for any e>0 and reports all occ occurrences of a wildcard string in O(m+s^g.M(n)+occ) time, where M(n)=o(log(log(log(n)))), s is the alphabet size, m is the number of alphabet symbols and g is the number of wildcard symbols in the query string. We also present an O(n)-bit index with O((m+s^g+occ).log^e(n)) query time and an O(n{log(log(n))}^2)-bit index with O((m+s^g+occ).log(log(n))) query time. These are the first non-trivial data structures for this problem that need o(n.log(n)) bits of space.

Moshe Lewenstein, Yakov Nekrich, and Jeffrey Scott Vitter. Space-Efficient String Indexing for Wildcard Pattern Matching. In 31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014). Leibniz International Proceedings in Informatics (LIPIcs), Volume 25, pp. 506-517, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

Copy BibTex To Clipboard

@InProceedings{lewenstein_et_al:LIPIcs.STACS.2014.506, author = {Lewenstein, Moshe and Nekrich, Yakov and Vitter, Jeffrey Scott}, title = {{Space-Efficient String Indexing for Wildcard Pattern Matching}}, booktitle = {31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014)}, pages = {506--517}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-65-1}, ISSN = {1868-8969}, year = {2014}, volume = {25}, editor = {Mayr, Ernst W. and Portier, Natacha}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2014.506}, URN = {urn:nbn:de:0030-drops-44838}, doi = {10.4230/LIPIcs.STACS.2014.506}, annote = {Keywords: compressed data structures, compressed indexes, pattern matching} }