27 Search Results for "Gortz, Inge Li"


Volume

LIPIcs, Volume 274

31st Annual European Symposium on Algorithms (ESA 2023)

ESA 2023, September 4-6, 2023, Amsterdam, The Netherlands

Editors: Inge Li Gørtz, Martin Farach-Colton, Simon J. Puglisi, and Grzegorz Herman

Volume

LIPIcs, Volume 161

31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)

CPM 2020, June 17-19, 2020, Copenhagen, Denmark

Editors: Inge Li Gørtz and Oren Weimann

Document
Gapped String Indexing in Subquadratic Space and Sublinear Query Time

Authors: Philip Bille, Inge Li Gørtz, Moshe Lewenstein, Solon P. Pissis, Eva Rotenberg, and Teresa Anna Steiner

Published in: LIPIcs, Volume 289, 41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024)


Abstract
In Gapped String Indexing, the goal is to compactly represent a string S of length n such that for any query consisting of two strings P₁ and P₂, called patterns, and an integer interval [α, β], called gap range, we can quickly find occurrences of P₁ and P₂ in S with distance in [α, β]. Gapped String Indexing is a central problem in computational biology and text mining and has thus received significant research interest, including parameterized and heuristic approaches. Despite this interest, the best-known time-space trade-offs for Gapped String Indexing are the straightforward 𝒪(n) space and 𝒪(n+ occ) query time or Ω(n²) space and Õ(|P₁| + |P₂| + occ) query time. We break through this barrier obtaining the first interesting trade-offs with polynomially subquadratic space and polynomially sublinear query time. In particular, we show that, for every 0 ≤ δ ≤ 1, there is a data structure for Gapped String Indexing with either Õ(n^{2-δ/3}) or Õ(n^{3-2δ}) space and Õ(|P₁| + |P₂| + n^{δ}⋅ (occ+1)) query time, where occ is the number of reported occurrences. As a new fundamental tool towards obtaining our main result, we introduce the Shifted Set Intersection problem: preprocess a collection of sets S₁, …, S_k of integers such that for any query consisting of three integers i,j,s, we can quickly output YES if and only if there exist a ∈ S_i and b ∈ S_j with a+s = b. We start by showing that the Shifted Set Intersection problem is equivalent to the indexing variant of 3SUM (3SUM Indexing) [Golovnev et al., STOC 2020]. We then give a data structure for Shifted Set Intersection with gaps, which entails a solution to the Gapped String Indexing problem. Furthermore, we enhance our data structure for deciding Shifted Set Intersection, so that we can support the reporting variant of the problem, i.e., outputting all certificates in the affirmative case. Via the obtained equivalence to 3SUM Indexing, we thus give new improved data structures for the reporting variant of 3SUM Indexing, and we show how this improves upon the state-of-the-art solution for Jumbled Indexing [Chan and Lewenstein, STOC 2015] for any alphabet of constant size σ > 5.

Cite as

Philip Bille, Inge Li Gørtz, Moshe Lewenstein, Solon P. Pissis, Eva Rotenberg, and Teresa Anna Steiner. Gapped String Indexing in Subquadratic Space and Sublinear Query Time. In 41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 289, pp. 16:1-16:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{bille_et_al:LIPIcs.STACS.2024.16,
  author =	{Bille, Philip and G{\o}rtz, Inge Li and Lewenstein, Moshe and Pissis, Solon P. and Rotenberg, Eva and Steiner, Teresa Anna},
  title =	{{Gapped String Indexing in Subquadratic Space and Sublinear Query Time}},
  booktitle =	{41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024)},
  pages =	{16:1--16:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-311-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{289},
  editor =	{Beyersdorff, Olaf and Kant\'{e}, Mamadou Moustapha and Kupferman, Orna and Lokshtanov, Daniel},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2024.16},
  URN =		{urn:nbn:de:0030-drops-197262},
  doi =		{10.4230/LIPIcs.STACS.2024.16},
  annote =	{Keywords: data structures, string indexing, indexing with gaps, two patterns}
}
Document
Complete Volume
LIPIcs, Volume 274, ESA 2023, Complete Volume

Authors: Inge Li Gørtz, Martin Farach-Colton, Simon J. Puglisi, and Grzegorz Herman

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)


Abstract
LIPIcs, Volume 274, ESA 2023, Complete Volume

Cite as

31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 1-1700, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Proceedings{grtz_et_al:LIPIcs.ESA.2023,
  title =	{{LIPIcs, Volume 274, ESA 2023, Complete Volume}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{1--1700},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023},
  URN =		{urn:nbn:de:0030-drops-186529},
  doi =		{10.4230/LIPIcs.ESA.2023},
  annote =	{Keywords: LIPIcs, Volume 274, ESA 2023, Complete Volume}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Inge Li Gørtz, Martin Farach-Colton, Simon J. Puglisi, and Grzegorz Herman

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 0:i-0:xxii, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{grtz_et_al:LIPIcs.ESA.2023.0,
  author =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{0:i--0:xxii},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.0},
  URN =		{urn:nbn:de:0030-drops-186535},
  doi =		{10.4230/LIPIcs.ESA.2023.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Document
Hierarchical Relative Lempel-Ziv Compression

Authors: Philip Bille, Inge Li Gørtz, Simon J. Puglisi, and Simon R. Tarnow

Published in: LIPIcs, Volume 265, 21st International Symposium on Experimental Algorithms (SEA 2023)


Abstract
Relative Lempel-Ziv (RLZ) parsing is a dictionary compression method in which a string S is compressed relative to a second string R (called the reference) by parsing S into a sequence of substrings that occur in R. RLZ is particularly effective at compressing sets of strings that have a high degree of similarity to the reference string, such as a set of genomes of individuals from the same species. With the now cheap cost of DNA sequencing, such datasets have become extremely abundant and are rapidly growing. In this paper, instead of using a single reference string for the entire collection, we investigate the use of different reference strings for subsets of the collection, with the aim of improving compression. In particular, we propose a new compression scheme hierarchical relative Lempel-Ziv (HRLZ) which form a rooted tree (or hierarchy) on the strings and then compress each string using RLZ with parent as reference, storing only the root of the tree in plain text. To decompress, we traverse the tree in BFS order starting at the root, decompressing children with respect to their parent. We show that this approach leads to a twofold improvement in compression on bacterial genome datasets, with negligible effect on decompression time compared to the standard single reference approach. We show that an effective hierarchy for a given set of strings can be constructed by computing the optimal arborescence of a completed weighted digraph of the strings, with weights as the number of phrases in the RLZ parsing of the source and destination vertices. We further show that instead of computing the complete graph, a sparse graph derived using locality-sensitive hashing can significantly reduce the cost of computing a good hierarchy, without adversely effecting compression performance.

Cite as

Philip Bille, Inge Li Gørtz, Simon J. Puglisi, and Simon R. Tarnow. Hierarchical Relative Lempel-Ziv Compression. In 21st International Symposium on Experimental Algorithms (SEA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 265, pp. 18:1-18:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{bille_et_al:LIPIcs.SEA.2023.18,
  author =	{Bille, Philip and G{\o}rtz, Inge Li and Puglisi, Simon J. and Tarnow, Simon R.},
  title =	{{Hierarchical Relative Lempel-Ziv Compression}},
  booktitle =	{21st International Symposium on Experimental Algorithms (SEA 2023)},
  pages =	{18:1--18:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-279-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{265},
  editor =	{Georgiadis, Loukas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2023.18},
  URN =		{urn:nbn:de:0030-drops-183680},
  doi =		{10.4230/LIPIcs.SEA.2023.18},
  annote =	{Keywords: Relative compression, Lempel-Ziv compression, RLZ, LZ77, string collections, compressed representation, data structures, efficient algorithms}
}
Document
Sliding Window String Indexing in Streams

Authors: Philip Bille, Johannes Fischer, Inge Li Gørtz, Max Rishøj Pedersen, and Tord Joakim Stordalen

Published in: LIPIcs, Volume 259, 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)


Abstract
Given a string S over an alphabet Σ, the string indexing problem is to preprocess S to subsequently support efficient pattern matching queries, that is, given a pattern string P report all the occurrences of P in S. In this paper we study the streaming sliding window string indexing problem. Here the string S arrives as a stream, one character at a time, and the goal is to maintain an index of the last w characters, called the window, for a specified parameter w. At any point in time a pattern matching query for a pattern P may arrive, also streamed one character at a time, and all occurrences of P within the current window must be returned. The streaming sliding window string indexing problem naturally captures scenarios where we want to index the most recent data (i.e. the window) of a stream while supporting efficient pattern matching. Our main result is a simple O(w) space data structure that uses O(log w) time with high probability to process each character from both the input string S and any pattern string P. Reporting each occurrence of P uses additional constant time per reported occurrence. Compared to previous work in similar scenarios this result is the first to achieve an efficient worst-case time per character from the input stream with high probability. We also consider a delayed variant of the problem, where a query may be answered at any point within the next δ characters that arrive from either stream. We present an O(w + δ) space data structure for this problem that improves the above time bounds to O(log (w/δ)). In particular, for a delay of δ = ε w we obtain an O(w) space data structure with constant time processing per character. The key idea to achieve our result is a novel and simple hierarchical structure of suffix trees of independent interest, inspired by the classic log-structured merge trees.

Cite as

Philip Bille, Johannes Fischer, Inge Li Gørtz, Max Rishøj Pedersen, and Tord Joakim Stordalen. Sliding Window String Indexing in Streams. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 4:1-4:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{bille_et_al:LIPIcs.CPM.2023.4,
  author =	{Bille, Philip and Fischer, Johannes and G{\o}rtz, Inge Li and Pedersen, Max Rish{\o}j and Stordalen, Tord Joakim},
  title =	{{Sliding Window String Indexing in Streams}},
  booktitle =	{34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)},
  pages =	{4:1--4:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-276-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{259},
  editor =	{Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2023.4},
  URN =		{urn:nbn:de:0030-drops-179587},
  doi =		{10.4230/LIPIcs.CPM.2023.4},
  annote =	{Keywords: String indexing, pattern matching, sliding window, streaming}
}
Document
The Fine-Grained Complexity of Episode Matching

Authors: Philip Bille, Inge Li Gørtz, Shay Mozes, Teresa Anna Steiner, and Oren Weimann

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)


Abstract
Given two strings S and P, the Episode Matching problem is to find the shortest substring of S that contains P as a subsequence. The best known upper bound for this problem is Õ(nm) by Das et al. (1997), where n,m are the lengths of S and P, respectively. Although the problem is well studied and has many applications in data mining, this bound has never been improved. In this paper we show why this is the case by proving that no O((nm)^{1-ε}) algorithm (even for binary strings) exists, unless the Strong Exponential Time Hypothesis (SETH) is false. We then consider the indexing version of the problem, where S is preprocessed into a data structure for answering episode matching queries P. We show that for any τ, there is a data structure using O(n+(n/(τ)) ^k) space that answers episode matching queries for any P of length k in O(k⋅ τ ⋅ log log n) time. We complement this upper bound with an almost matching lower bound, showing that any data structure that answers episode matching queries for patterns of length k in time O(n^δ), must use Ω(n^{k-kδ-o(1)}) space, unless the Strong k-Set Disjointness Conjecture is false. Finally, for the special case of k = 2, we present a faster construction of the data structure using fast min-plus multiplication of bounded integer matrices.

Cite as

Philip Bille, Inge Li Gørtz, Shay Mozes, Teresa Anna Steiner, and Oren Weimann. The Fine-Grained Complexity of Episode Matching. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 4:1-4:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{bille_et_al:LIPIcs.CPM.2022.4,
  author =	{Bille, Philip and G{\o}rtz, Inge Li and Mozes, Shay and Steiner, Teresa Anna and Weimann, Oren},
  title =	{{The Fine-Grained Complexity of Episode Matching}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{4:1--4:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.4},
  URN =		{urn:nbn:de:0030-drops-161312},
  doi =		{10.4230/LIPIcs.CPM.2022.4},
  annote =	{Keywords: Pattern matching, fine-grained complexity, longest common subsequence}
}
Document
Predecessor on the Ultra-Wide Word RAM

Authors: Philip Bille, Inge Li Gørtz, and Tord Stordalen

Published in: LIPIcs, Volume 227, 18th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2022)


Abstract
We consider the predecessor problem on the ultra-wide word RAM model of computation, which extends the word RAM model with ultrawords consisting of w² bits [TAMC, 2015]. The model supports arithmetic and boolean operations on ultrawords, in addition to scattered memory operations that access or modify w (potentially non-contiguous) memory addresses simultaneously. The ultra-wide word RAM model captures (and idealizes) modern vector processor architectures. Our main result is a simple, linear space data structure that supports predecessor in constant time and updates in amortized, expected constant time. This improves the space of the previous constant time solution that uses space in the order of the size of the universe. Our result is based on a new implementation of the classic x-fast trie data structure of Willard [Inform. Process. Lett. 17(2), 1983] combined with a new dictionary data structure that supports fast parallel lookups.

Cite as

Philip Bille, Inge Li Gørtz, and Tord Stordalen. Predecessor on the Ultra-Wide Word RAM. In 18th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 227, pp. 18:1-18:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{bille_et_al:LIPIcs.SWAT.2022.18,
  author =	{Bille, Philip and G{\o}rtz, Inge Li and Stordalen, Tord},
  title =	{{Predecessor on the Ultra-Wide Word RAM}},
  booktitle =	{18th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2022)},
  pages =	{18:1--18:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-236-5},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{227},
  editor =	{Czumaj, Artur and Xin, Qin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SWAT.2022.18},
  URN =		{urn:nbn:de:0030-drops-161786},
  doi =		{10.4230/LIPIcs.SWAT.2022.18},
  annote =	{Keywords: Ultra-wide word RAM model, predecessor, word-level parallelism}
}
Document
Gapped Indexing for Consecutive Occurrences

Authors: Philip Bille, Inge Li Gørtz, Max Rishøj Pedersen, and Teresa Anna Steiner

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The classic string indexing problem is to preprocess a string S into a compact data structure that supports efficient pattern matching queries. Typical queries include existential queries (decide if the pattern occurs in S), reporting queries (return all positions where the pattern occurs), and counting queries (return the number of occurrences of the pattern). In this paper we consider a variant of string indexing, where the goal is to compactly represent the string such that given two patterns P₁ and P₂ and a gap range [α, β] we can quickly find the consecutive occurrences of P₁ and P₂ with distance in [α, β], i.e., pairs of subsequent occurrences with distance within the range. We present data structures that use Õ(n) space and query time Õ(|P₁|+|P₂|+n^{2/3}) for existence and counting and Õ(|P₁|+|P₂|+n^{2/3}occ^{1/3}) for reporting. We complement this with a conditional lower bound based on the set intersection problem showing that any solution using Õ(n) space must use Ω̃(|P₁| + |P₂| + √n) query time. To obtain our results we develop new techniques and ideas of independent interest including a new suffix tree decomposition and hardness of a variant of the set intersection problem.

Cite as

Philip Bille, Inge Li Gørtz, Max Rishøj Pedersen, and Teresa Anna Steiner. Gapped Indexing for Consecutive Occurrences. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 10:1-10:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{bille_et_al:LIPIcs.CPM.2021.10,
  author =	{Bille, Philip and G{\o}rtz, Inge Li and Pedersen, Max Rish{\o}j and Steiner, Teresa Anna},
  title =	{{Gapped Indexing for Consecutive Occurrences}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{10:1--10:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.10},
  URN =		{urn:nbn:de:0030-drops-139615},
  doi =		{10.4230/LIPIcs.CPM.2021.10},
  annote =	{Keywords: String indexing, two patterns, consecutive occurrences, conditional lower bound}
}
Document
Random Access in Persistent Strings

Authors: Philip Bille and Inge Li Gørtz

Published in: LIPIcs, Volume 181, 31st International Symposium on Algorithms and Computation (ISAAC 2020)


Abstract
We consider compact representations of collections of similar strings that support random access queries. The collection of strings is given by a rooted tree where edges are labeled by an edit operation (inserting, deleting, or replacing a character) and a node represents the string obtained by applying the sequence of edit operations on the path from the root to the node. The goal is to compactly represent the entire collection while supporting fast random access to any part of a string in the collection. This problem captures natural scenarios such as representing the past history of an edited document or representing highly-repetitive collections. Given a tree with n nodes, we show how to represent the corresponding collection in O(n) space and optimal O(log n/ log log n) query time. This improves the previous time-space trade-offs for the problem. To obtain our results, we introduce new techniques and ideas, including a reduction to a new geometric line segment selection together with an efficient solution.

Cite as

Philip Bille and Inge Li Gørtz. Random Access in Persistent Strings. In 31st International Symposium on Algorithms and Computation (ISAAC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 181, pp. 48:1-48:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{bille_et_al:LIPIcs.ISAAC.2020.48,
  author =	{Bille, Philip and G{\o}rtz, Inge Li},
  title =	{{Random Access in Persistent Strings}},
  booktitle =	{31st International Symposium on Algorithms and Computation (ISAAC 2020)},
  pages =	{48:1--48:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-173-3},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{181},
  editor =	{Cao, Yixin and Cheng, Siu-Wing and Li, Minming},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2020.48},
  URN =		{urn:nbn:de:0030-drops-133922},
  doi =		{10.4230/LIPIcs.ISAAC.2020.48},
  annote =	{Keywords: Data compression, data structures, persistent strings}
}
Document
String Indexing for Top-k Close Consecutive Occurrences

Authors: Philip Bille, Inge Li Gørtz, Max Rishøj Pedersen, Eva Rotenberg, and Teresa Anna Steiner

Published in: LIPIcs, Volume 182, 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020)


Abstract
The classic string indexing problem is to preprocess a string S into a compact data structure that supports efficient subsequent pattern matching queries, that is, given a pattern string P, report all occurrences of P within S. In this paper, we study a basic and natural extension of string indexing called the string indexing for top-k close consecutive occurrences problem (Sitcco). Here, a consecutive occurrence is a pair (i,j), i < j, such that P occurs at positions i and j in S and there is no occurrence of P between i and j, and their distance is defined as j-i. Given a pattern P and a parameter k, the goal is to report the top-k consecutive occurrences of P in S of minimal distance. The challenge is to compactly represent S while supporting queries in time close to the length of P and k. We give two time-space trade-offs for the problem. Let n be the length of S, m the length of P, and ε ∈ (0,1]. Our first result achieves O(nlog n) space and optimal query time of O(m+k), and our second result achieves linear space and query time O(m+k^{1+ε}). Along the way, we develop several techniques of independent interest, including a new translation of the problem into a line segment intersection problem and a new recursive clustering technique for trees.

Cite as

Philip Bille, Inge Li Gørtz, Max Rishøj Pedersen, Eva Rotenberg, and Teresa Anna Steiner. String Indexing for Top-k Close Consecutive Occurrences. In 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 182, pp. 14:1-14:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{bille_et_al:LIPIcs.FSTTCS.2020.14,
  author =	{Bille, Philip and G{\o}rtz, Inge Li and Pedersen, Max Rish{\o}j and Rotenberg, Eva and Steiner, Teresa Anna},
  title =	{{String Indexing for Top-k Close Consecutive Occurrences}},
  booktitle =	{40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020)},
  pages =	{14:1--14:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-174-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{182},
  editor =	{Saxena, Nitin and Simon, Sunil},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2020.14},
  URN =		{urn:nbn:de:0030-drops-132558},
  doi =		{10.4230/LIPIcs.FSTTCS.2020.14},
  annote =	{Keywords: String indexing, pattern matching, consecutive occurrences}
}
Document
Track A: Algorithms, Complexity and Games
Space Efficient Construction of Lyndon Arrays in Linear Time

Authors: Philip Bille, Jonas Ellert, Johannes Fischer, Inge Li Gørtz, Florian Kurpicz, J. Ian Munro, and Eva Rotenberg

Published in: LIPIcs, Volume 168, 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)


Abstract
Given a string S of length n, its Lyndon array identifies for each suffix S[i..n] the next lexicographically smaller suffix S[j..n], i.e. the minimal index j > i with S[i..n] ≻ S[j..n]. Apart from its plain (n log₂ n)-bit array representation, the Lyndon array can also be encoded as a succinct parentheses sequence that requires only 2n bits of space. While linear time construction algorithms for both representations exist, it has previously been unknown if the same time bound can be achieved with less than Ω(n lg n) bits of additional working space. We show that, in fact, o(n) additional bits are sufficient to compute the succinct 2n-bit version of the Lyndon array in linear time. For the plain (n log₂ n)-bit version, we only need 𝒪(1) additional words to achieve linear time. Our space efficient construction algorithm makes the Lyndon array more accessible as a fundamental data structure in applications like full-text indexing.

Cite as

Philip Bille, Jonas Ellert, Johannes Fischer, Inge Li Gørtz, Florian Kurpicz, J. Ian Munro, and Eva Rotenberg. Space Efficient Construction of Lyndon Arrays in Linear Time. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 14:1-14:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{bille_et_al:LIPIcs.ICALP.2020.14,
  author =	{Bille, Philip and Ellert, Jonas and Fischer, Johannes and G{\o}rtz, Inge Li and Kurpicz, Florian and Munro, J. Ian and Rotenberg, Eva},
  title =	{{Space Efficient Construction of Lyndon Arrays in Linear Time}},
  booktitle =	{47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)},
  pages =	{14:1--14:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-138-2},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{168},
  editor =	{Czumaj, Artur and Dawar, Anuj and Merelli, Emanuela},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2020.14},
  URN =		{urn:nbn:de:0030-drops-124211},
  doi =		{10.4230/LIPIcs.ICALP.2020.14},
  annote =	{Keywords: String algorithms, string suffixes, succinct data structures, Lyndon word, Lyndon array, nearest smaller values, nearest smaller suffixes}
}
Document
Complete Volume
LIPIcs, Volume 161, CPM 2020, Complete Volume

Authors: Inge Li Gørtz and Oren Weimann

Published in: LIPIcs, Volume 161, 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)


Abstract
LIPIcs, Volume 161, CPM 2020, Complete Volume

Cite as

31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 1-418, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@Proceedings{grtz_et_al:LIPIcs.CPM.2020,
  title =	{{LIPIcs, Volume 161, CPM 2020, Complete Volume}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{1--418},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{G{\o}rtz, Inge Li and Weimann, Oren},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2020},
  URN =		{urn:nbn:de:0030-drops-121245},
  doi =		{10.4230/LIPIcs.CPM.2020},
  annote =	{Keywords: LIPIcs, Volume 161, CPM 2020, Complete Volume}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Inge Li Gørtz and Oren Weimann

Published in: LIPIcs, Volume 161, 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 0:i-0:xvi, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{grtz_et_al:LIPIcs.CPM.2020.0,
  author =	{G{\o}rtz, Inge Li and Weimann, Oren},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{0:i--0:xvi},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{G{\o}rtz, Inge Li and Weimann, Oren},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2020.0},
  URN =		{urn:nbn:de:0030-drops-121252},
  doi =		{10.4230/LIPIcs.CPM.2020.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
  • Refine by Author
  • 23 Gørtz, Inge Li
  • 19 Bille, Philip
  • 5 Steiner, Teresa Anna
  • 4 Weimann, Oren
  • 3 Cording, Patrick Hagge
  • Show More...

  • Refine by Classification
  • 10 Theory of computation → Design and analysis of algorithms
  • 8 Theory of computation → Pattern matching
  • 5 Theory of computation → Data structures design and analysis
  • 4 Mathematics of computing → Combinatoric problems
  • 2 Applied computing → Computational biology
  • Show More...

  • Refine by Keyword
  • 5 pattern matching
  • 3 String indexing
  • 3 data structures
  • 2 Conference Organization
  • 2 Front Matter
  • Show More...

  • Refine by Type
  • 25 document
  • 2 volume

  • Refine by Publication Year
  • 7 2020
  • 5 2023
  • 4 2016
  • 4 2017
  • 3 2019
  • Show More...

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail