DROPS

Volume

LIPIcs, Volume 186

24th International Conference on Database Theory (ICDT 2021)

ICDT 2021, March 23-26, 2021, Nicosia, Cyprus

Editors: Ke Yi and Zhewei Wei

Document

DOI: 10.4230/LIPIcs.CCC.2024.2

Streaming Zero-Knowledge Proofs

Authors: Graham Cormode, Marcel Dall'Agnol, Tom Gur, and Chris Hickey

Published in: LIPIcs, Volume 300, 39th Computational Complexity Conference (CCC 2024)

Abstract

Streaming interactive proofs (SIPs) enable a space-bounded algorithm with one-pass access to a massive stream of data to verify a computation that requires large space, by communicating with a powerful but untrusted prover. This work initiates the study of zero-knowledge proofs for data streams. We define the notion of zero-knowledge in the streaming setting and construct zero-knowledge SIPs for the two main algorithmic building blocks in the streaming interactive proofs literature: the sumcheck and polynomial evaluation protocols. To the best of our knowledge all known streaming interactive proofs are based on either of these tools, and indeed, this allows us to obtain zero-knowledge SIPs for central streaming problems such as index, point and range queries, median, frequency moments, and inner product. Our protocols are efficient in terms of time and space, as well as communication: the verifier algorithm’s space complexity is polylog(n) and, after a non-interactive setup that uses a random string of near-linear length, the remaining parameters are n^o(1). En route, we develop an algorithmic toolkit for designing zero-knowledge data stream protocols, consisting of an algebraic streaming commitment protocol and a temporal commitment protocol. Our analyses rely on delicate algebraic and information-theoretic arguments and reductions from average-case communication complexity.

Cite as

Graham Cormode, Marcel Dall'Agnol, Tom Gur, and Chris Hickey. Streaming Zero-Knowledge Proofs. In 39th Computational Complexity Conference (CCC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 300, pp. 2:1-2:66, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@InProceedings{cormode_et_al:LIPIcs.CCC.2024.2,
  author =	{Cormode, Graham and Dall'Agnol, Marcel and Gur, Tom and Hickey, Chris},
  title =	{{Streaming Zero-Knowledge Proofs}},
  booktitle =	{39th Computational Complexity Conference (CCC 2024)},
  pages =	{2:1--2:66},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-331-7},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{300},
  editor =	{Santhanam, Rahul},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2024.2},
  URN =		{urn:nbn:de:0030-drops-203988},
  doi =		{10.4230/LIPIcs.CCC.2024.2},
  annote =	{Keywords: Zero-knowledge proofs, streaming algorithms, computational complexity}
}

Document

DOI: 10.4230/LIPIcs.CCC.2024.36

Gap MCSP Is Not (Levin) NP-Complete in Obfustopia

Authors: Noam Mazor and Rafael Pass

Published in: LIPIcs, Volume 300, 39th Computational Complexity Conference (CCC 2024)

Abstract

We demonstrate that under believable cryptographic hardness assumptions, Gap versions of standard meta-complexity problems, such as the Minimum Circuit Size Problem (MCSP) and the Minimum Time-Bounded Kolmogorov Complexity problem (MKTP), are not NP-complete w.r.t. Levin (i.e., witness-preserving many-to-one) reductions. In more detail:
- Assuming the existence of indistinguishability obfuscation, and subexponentially-secure one-way functions, an appropriate Gap version of MCSP is not NP-complete under randomized Levin-reductions.
- Assuming the existence of subexponentially-secure indistinguishability obfuscation, subexponentially-secure one-way functions and injective PRGs, an appropriate Gap version of MKTP is not NP-complete under randomized Levin-reductions.

Cite as

Noam Mazor and Rafael Pass. Gap MCSP Is Not (Levin) NP-Complete in Obfustopia. In 39th Computational Complexity Conference (CCC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 300, pp. 36:1-36:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@InProceedings{mazor_et_al:LIPIcs.CCC.2024.36,
  author =	{Mazor, Noam and Pass, Rafael},
  title =	{{Gap MCSP Is Not (Levin) NP-Complete in Obfustopia}},
  booktitle =	{39th Computational Complexity Conference (CCC 2024)},
  pages =	{36:1--36:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-331-7},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{300},
  editor =	{Santhanam, Rahul},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2024.36},
  URN =		{urn:nbn:de:0030-drops-204322},
  doi =		{10.4230/LIPIcs.CCC.2024.36},
  annote =	{Keywords: Kolmogorov complexity, MCSP, Levin Reduction}
}

Document

DOI: 10.4230/LIPIcs.SEA.2024.9

Top-k Frequent Patterns in Streams and Parameterized-Space LZ Compression

Authors: Patrick Dinklage, Johannes Fischer, and Nicola Prezza

Published in: LIPIcs, Volume 301, 22nd International Symposium on Experimental Algorithms (SEA 2024)

Abstract

We present novel online approximations of the Lempel-Ziv 77 (LZ77) and Lempel-Ziv 78 (LZ78) compression schemes [Lempel & Ziv, 1977/1978] with parameterizable space usage based on estimating which k patterns occur the most frequently in the streamed input for parameter k. This new approach overcomes the issue of finding only local repetitions, which is a natural limitation of algorithms that compress using a sliding window or by partitioning the input into blocks. For this, we introduce the top-k trie, a summary for maintaining online the top-k frequent consecutive patterns in a stream of characters based on a combination of the Lempel-Ziv 78 compression scheme and the Misra-Gries algorithm for frequent item estimation in streams. Using straightforward encoding, our implementations yield compression ratios (output over input size) competitive with established general-purpose LZ-based compression utilities such as gzip or xz.
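For orientation, the Misra-Gries frequent-item summary named in the abstract can be sketched as below. This is a minimal textbook version over individual stream items, not the paper's top-k trie; the function name `misra_gries` and the parameter `k` (the number of counters kept) are illustrative assumptions.

```python
def misra_gries(stream, k):
    """Textbook Misra-Gries summary that keeps at most k counters.

    Each reported count underestimates the true frequency of its item
    by at most len(stream) / (k + 1).
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical input: with two counters, only the dominant character survives.
summary = misra_gries("abracadabra", k=2)
print(summary)
```

The summary uses space proportional to `k` regardless of stream length, which is the property that makes a parameterizable space budget possible.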

Cite as

Patrick Dinklage, Johannes Fischer, and Nicola Prezza. Top-k Frequent Patterns in Streams and Parameterized-Space LZ Compression. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 9:1-9:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@InProceedings{dinklage_et_al:LIPIcs.SEA.2024.9,
  author =	{Dinklage, Patrick and Fischer, Johannes and Prezza, Nicola},
  title =	{{Top-k Frequent Patterns in Streams and Parameterized-Space LZ Compression}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{9:1--9:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.9},
  URN =		{urn:nbn:de:0030-drops-203748},
  doi =		{10.4230/LIPIcs.SEA.2024.9},
  annote =	{Keywords: compression, streaming, heavy hitters, algorithm engineering}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2024.1

JuMP2start: Time-Aware Stop-Start Technology for a Software-Defined Vehicle System

Authors: Anam Farrukh and Richard West

Published in: LIPIcs, Volume 298, 36th Euromicro Conference on Real-Time Systems (ECRTS 2024)

Abstract

Software-defined vehicle (SDV) systems replace traditional ECU architectures with software tasks running on centralized multicore processors in automotive-grade PCs. However, PC boot delays to cold-start an integrated vehicle management system (VMS) are problematic for time-critical functions, which must process sensor and actuator data within specific time bounds. To tackle this challenge, we present JuMP2start: a time-aware multicore stop-start approach for SDVs. JuMP2start leverages PC-class suspend-to-RAM techniques to capture a system snapshot when the vehicle is stopped. Upon restart, critical services are resumed-from-RAM within the order of milliseconds, compared to normal cold-start times. This work showcases how JuMP2start manages global suspension and resumption mechanisms for a state-of-the-art dual-domain vehicle management system comprising real-time OS (RTOS) and Linux SMP guests. JuMP2start models automotive tasks as continuable or restartable to ensure timing- and safety-critical function pipelines are reactively resumed with low latency, while discarding stale task state. Experiments with the VMS show that critical CAN traffic processing resumes within 500 microseconds of waking the RTOS guest, and reaches steady-state throughput in under 7ms.

Cite as

Anam Farrukh and Richard West. JuMP2start: Time-Aware Stop-Start Technology for a Software-Defined Vehicle System. In 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 298, pp. 1:1-1:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

@InProceedings{farrukh_et_al:LIPIcs.ECRTS.2024.1,
  author =	{Farrukh, Anam and West, Richard},
  title =	{{JuMP2start: Time-Aware Stop-Start Technology for a Software-Defined Vehicle System}},
  booktitle =	{36th Euromicro Conference on Real-Time Systems (ECRTS 2024)},
  pages =	{1:1--1:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-324-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{298},
  editor =	{Pellizzoni, Rodolfo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2024.1},
  URN =		{urn:nbn:de:0030-drops-203046},
  doi =		{10.4230/LIPIcs.ECRTS.2024.1},
  annote =	{Keywords: Time-aware stop-start, Real-time power management, Suspend-to-RAM, Partitioning hypervisor, Vehicle management system, Vehicle-OS, Software-defined vehicles (SDV)}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2021.45

Approximate Range Counting Under Differential Privacy

Authors: Ziyue Huang and Ke Yi

Published in: LIPIcs, Volume 189, 37th International Symposium on Computational Geometry (SoCG 2021)

Abstract

Range counting under differential privacy has been studied extensively. Unfortunately, lower bounds based on discrepancy theory suggest that large errors have to be introduced in order to preserve privacy: Essentially for any range space (except axis-parallel rectangles), the error has to be polynomial. In this paper, we show that by allowing a standard notion of geometric approximation where points near the boundary of the range may or may not be counted, the error can be reduced to logarithmic. Furthermore, our approximate range counting data structure can be used to solve the approximate nearest neighbor (ANN) problem and k-NN classification, leading to the first differentially private algorithms for these two problems with provable guarantees on the utility.

Cite as

Ziyue Huang and Ke Yi. Approximate Range Counting Under Differential Privacy. In 37th International Symposium on Computational Geometry (SoCG 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 189, pp. 45:1-45:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{huang_et_al:LIPIcs.SoCG.2021.45,
  author =	{Huang, Ziyue and Yi, Ke},
  title =	{{Approximate Range Counting Under Differential Privacy}},
  booktitle =	{37th International Symposium on Computational Geometry (SoCG 2021)},
  pages =	{45:1--45:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-184-9},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{189},
  editor =	{Buchin, Kevin and Colin de Verdi\`{e}re, \'{E}ric},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2021.45},
  URN =		{urn:nbn:de:0030-drops-138441},
  doi =		{10.4230/LIPIcs.SoCG.2021.45},
  annote =	{Keywords: Differential Privacy, Approximate Range Counting}
}

Document

Complete Volume

DOI: 10.4230/LIPIcs.ICDT.2021

LIPIcs, Volume 186, ICDT 2021, Complete Volume

Authors: Ke Yi and Zhewei Wei

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

LIPIcs, Volume 186, ICDT 2021, Complete Volume

Cite as

24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 1-438, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@Proceedings{yi_et_al:LIPIcs.ICDT.2021,
  title =	{{LIPIcs, Volume 186, ICDT 2021, Complete Volume}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{1--438},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021},
  URN =		{urn:nbn:de:0030-drops-137076},
  doi =		{10.4230/LIPIcs.ICDT.2021},
  annote =	{Keywords: LIPIcs, Volume 186, ICDT 2021, Complete Volume}
}

Document

Front Matter

DOI: 10.4230/LIPIcs.ICDT.2021.0

Front Matter, Table of Contents, Preface, Conference Organization

Authors: Ke Yi and Zhewei Wei

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

Front Matter, Table of Contents, Preface, Conference Organization

Cite as

24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 0:i-0:xvi, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{yi_et_al:LIPIcs.ICDT.2021.0,
  author =	{Yi, Ke and Wei, Zhewei},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{0:i--0:xvi},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.0},
  URN =		{urn:nbn:de:0030-drops-137086},
  doi =		{10.4230/LIPIcs.ICDT.2021.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2021.1

Explainability Queries for ML Models and its Connections with Data Management Problems (Invited Talk)

Authors: Pablo Barceló

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

In this talk I will present two recent examples of my research on explainability problems over machine learning (ML) models. In rough terms, these explainability problems deal with specific queries one poses over a ML model in order to obtain meaningful justifications for their results. Both of the examples I will present deal with “local” and “post-hoc” explainability queries. Here “local” means that we intend to explain the output of the ML model for a particular input, while “post-hoc” refers to the fact that the explanation is obtained after the model is trained. In the process I will also establish connections with problems studied in data management. This with the intention of suggesting new possibilities for cross-fertilization between the area and ML. The first example I will present refers to computing explanations with scores based on Shapley values, in particular with the recently proposed, and already influential, SHAP-score. This score provides a measure of how different features in the input contribute to the output of the ML model. We provide a detailed analysis of the complexity of this problem for different classes of Boolean circuits. In particular, we show that the problem of computing SHAP-scores is tractable as long as the circuit is deterministic and decomposable, but becomes computationally hard if any of these restrictions is lifted. The tractability part of this result provides a generalization of a recent result stating that, for Boolean hierarchical conjunctive queries, the Shapley-value of the contribution of a tuple in the database to the final result can be computed in polynomial time. The second example I will present refers to the comparison of different ML models in terms of important families of (local and post-hoc) explainability queries. For the models, I will consider multi-layer perceptrons and binary decision diagrams. The main object of study will be the computational complexity of the aforementioned queries over such models. 
The obtained results will show an interesting theoretical counterpart to wisdom’s claims on interpretability. This work also suggests the need for developing query languages that support the process of retrieving explanations from ML models, and also for obtaining general tractability results for such languages over specific classes of models.
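As background for the Shapley-based scores discussed in the abstract, the classical Shapley value of a cooperative game can be computed by brute force as sketched below. This is only the generic definition and takes exponential time; it is neither the SHAP-score itself (which is defined relative to a distribution over inputs) nor the tractable algorithm for deterministic and decomposable circuits. The names `shapley_values`, `players`, and `value`, and the toy game, are illustrative assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Brute-force Shapley values for a set function value(frozenset) -> number."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


# Hypothetical game: the output is 1 exactly when both "x" and "y" are present.
def toy_value(coalition):
    return 1 if {"x", "y"} <= coalition else 0


scores = shapley_values(["x", "y", "z"], toy_value)
print(scores)
```

In the toy game, `x` and `y` share the credit equally and `z` receives none, and the scores sum to the value of the full coalition.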

Cite as

Pablo Barceló. Explainability Queries for ML Models and its Connections with Data Management Problems (Invited Talk). In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, p. 1:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{barcelo:LIPIcs.ICDT.2021.1,
  author =	{Barcel\'{o}, Pablo},
  title =	{{Explainability Queries for ML Models and its Connections with Data Management Problems}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{1:1--1:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.1},
  URN =		{urn:nbn:de:0030-drops-137091},
  doi =		{10.4230/LIPIcs.ICDT.2021.1},
  annote =	{Keywords: ML models, Explainability, Shapley values, decision trees, OBDDs, deterministic and decomposable Boolean circuits}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2021.2

Comparing Apples and Oranges: Fairness and Diversity in Ranking (Invited Talk)

Authors: Julia Stoyanovich

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

Algorithmic rankers take a collection of candidates as input and produce a ranking (permutation) of the candidates as output. The simplest kind of ranker is score-based; it computes a score of each candidate independently and returns the candidates in score order. Another common kind of ranker is learning-to-rank, where supervised learning is used to predict the ranking of unseen candidates. For both kinds of rankers, we may output the entire permutation or only the highest scoring k candidates, the top-k. Set selection is a special case of ranking that ignores the relative order among the top-k. In the past few years, there has been much work on incorporating fairness and diversity requirements into algorithmic rankers, with contributions coming from the data management, algorithms, information retrieval, and recommender systems communities. In my talk I will offer a broad perspective that connects formalizations and algorithmic approaches across subfields, grounding them in a common narrative around the value frameworks that motivate specific fairness- and diversity-enhancing interventions. I will discuss some recent and ongoing work, and will outline future research directions where the data management community is well-positioned to make lasting impact, especially if we attack these problems with our rich theory-meets-systems toolkit.
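The score-based ranker, top-k output, and set selection described in the abstract can be illustrated with a short sketch. The function names, the `score` callback, and the sample scores are hypothetical and chosen only for illustration; no fairness or diversity constraint is modelled here.

```python
def score_based_ranking(candidates, score, k=None):
    """Score each candidate independently and return them in descending score order.

    If k is given, only the k highest-scoring candidates (the top-k) are returned.
    """
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked if k is None else ranked[:k]


def set_selection(candidates, score, k):
    """Top-k as a set: the relative order among the selected candidates is ignored."""
    return set(score_based_ranking(candidates, score, k))


# Hypothetical candidates with made-up scores.
scores = {"a": 0.9, "b": 0.4, "c": 0.7}
full_ranking = score_based_ranking(scores, scores.get)
top_two = score_based_ranking(scores, scores.get, k=2)
selected = set_selection(scores, scores.get, k=2)
print(full_ranking, top_two, selected)
```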

Cite as

Julia Stoyanovich. Comparing Apples and Oranges: Fairness and Diversity in Ranking (Invited Talk). In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, p. 2:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{stoyanovich:LIPIcs.ICDT.2021.2,
  author =	{Stoyanovich, Julia},
  title =	{{Comparing Apples and Oranges: Fairness and Diversity in Ranking}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{2:1--2:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.2},
  URN =		{urn:nbn:de:0030-drops-137104},
  doi =		{10.4230/LIPIcs.ICDT.2021.2},
  annote =	{Keywords: fairness, diversity, ranking, set selection, responsible data management}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2021.3

Box Covers and Domain Orderings for Beyond Worst-Case Join Processing

Authors: Kaleb Alway, Eric Blais, and Semih Salihoglu

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

Recent beyond worst-case optimal join algorithms Minesweeper and its generalization Tetris have brought the theory of indexing and join processing together by developing a geometric framework for joins. These algorithms take as input an index ℬ, referred to as a box cover, that stores output gaps that can be inferred from traditional indexes, such as B+ trees or tries, on the input relations. The performances of these algorithms highly depend on the certificate of ℬ, which is the smallest subset of gaps in ℬ whose union covers all of the gaps in the output space of a query Q. Different box covers can have different size certificates and the sizes of both the box covers and certificates highly depend on the ordering of the domain values of the attributes in Q. We study how to generate box covers that contain small size certificates to guarantee efficient runtimes for these algorithms. First, given a query Q over a set of relations of size N and a fixed set of domain orderings for the attributes, we give a Õ(N)-time algorithm called GAMB which generates a box cover for Q that is guaranteed to contain the smallest size certificate across any box cover for Q. Second, we show that finding a domain ordering to minimize the box cover size and certificate is NP-hard through a reduction from the 2 consecutive block minimization problem on boolean matrices. Our third contribution is a Õ(N)-time approximation algorithm called ADORA to compute domain orderings, under which one can compute a box cover of size Õ(K^r), where K is the minimum box cover for Q under any domain ordering and r is the maximum arity of any relation. This guarantees certificates of size Õ(K^r). We combine ADORA and GAMB with Tetris to form a new algorithm we call TetrisReordered, which provides several new beyond worst-case bounds. On infinite families of queries, TetrisReordered’s runtimes are unboundedly better than the bounds stated in prior work.

Cite as

Kaleb Alway, Eric Blais, and Semih Salihoglu. Box Covers and Domain Orderings for Beyond Worst-Case Join Processing. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 3:1-3:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{alway_et_al:LIPIcs.ICDT.2021.3,
  author =	{Alway, Kaleb and Blais, Eric and Salihoglu, Semih},
  title =	{{Box Covers and Domain Orderings for Beyond Worst-Case Join Processing}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{3:1--3:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.3},
  URN =		{urn:nbn:de:0030-drops-137114},
  doi =		{10.4230/LIPIcs.ICDT.2021.3},
  annote =	{Keywords: Beyond worst-case join algorithms, Tetris, Box covers, Domain orderings}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2021.4

A Purely Regular Approach to Non-Regular Core Spanners

Authors: Markus L. Schmid and Nicole Schweikardt

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

The regular spanners (characterised by vset-automata) are closed under the algebraic operations of union, join and projection, and have desirable algorithmic properties. The core spanners (introduced by Fagin, Kimelfeld, Reiss, and Vansummeren (PODS 2013, JACM 2015) as a formalisation of the core functionality of the query language AQL used in IBM’s SystemT) additionally need string equality selections and it has been shown by Freydenberger and Holldack (ICDT 2016, Theory of Computing Systems 2018) that this leads to high complexity and even undecidability of the typical problems in static analysis and query evaluation. We propose an alternative approach to core spanners: by incorporating the string-equality selections directly into the regular language that represents the underlying regular spanner (instead of treating it as an algebraic operation on the table extracted by the regular spanner), we obtain a fragment of core spanners that, while having slightly weaker expressive power than the full class of core spanners, arguably still covers the intuitive applications of string equality selections for information extraction and has much better upper complexity bounds of the typical problems in static analysis and query evaluation.

Cite as

Markus L. Schmid and Nicole Schweikardt. A Purely Regular Approach to Non-Regular Core Spanners. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 4:1-4:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{schmid_et_al:LIPIcs.ICDT.2021.4,
  author =	{Schmid, Markus L. and Schweikardt, Nicole},
  title =	{{A Purely Regular Approach to Non-Regular Core Spanners}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{4:1--4:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.4},
  URN =		{urn:nbn:de:0030-drops-137124},
  doi =		{10.4230/LIPIcs.ICDT.2021.4},
  annote =	{Keywords: Document spanners, regular expressions with backreferences}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2021.5

Ranked Enumeration of Conjunctive Query Results

Authors: Shaleen Deep and Paraschos Koutris

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

We study the problem of enumerating answers of Conjunctive Queries ranked according to a given ranking function. Our main contribution is a novel algorithm with small preprocessing time, logarithmic delay, and non-trivial space usage during execution. To allow for efficient enumeration, we exploit certain properties of ranking functions that frequently occur in practice. To this end, we introduce the notions of decomposable and compatible (w.r.t. a query decomposition) ranking functions, which allow for partial aggregation of tuple scores in order to efficiently enumerate the output. We complement the algorithmic results with lower bounds that justify why restrictions on the structure of ranking functions are necessary. Our results extend and improve upon a long line of work that has studied ranked enumeration from both a theoretical and practical perspective.

Cite as

Shaleen Deep and Paraschos Koutris. Ranked Enumeration of Conjunctive Query Results. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 5:1-5:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{deep_et_al:LIPIcs.ICDT.2021.5,
  author =	{Deep, Shaleen and Koutris, Paraschos},
  title =	{{Ranked Enumeration of Conjunctive Query Results}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{5:1--5:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.5},
  URN =		{urn:nbn:de:0030-drops-137139},
  doi =		{10.4230/LIPIcs.ICDT.2021.5},
  annote =	{Keywords: Query result enumeration, joins, ranking}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2021.6

Towards Optimal Dynamic Indexes for Approximate (and Exact) Triangle Counting

Authors: Shangqi Lu and Yufei Tao

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

In ICDT'19, Kara, Ngo, Nikolic, Olteanu, and Zhang gave a structure which maintains the number T of triangles in an undirected graph G = (V, E) along with the edge insertions/deletions in G. Using O(m) space (m = |E|), their structure supports an update in O(√m log m) amortized time which is optimal (up to polylog factors) subject to the OMv-conjecture (Henzinger, Krinninger, Nanongkai, and Saranurak, STOC'15). Aiming to improve the update efficiency, we study:
- the optimal tradeoff between update time and approximation quality. We require a structure to provide the (ε, Γ)-guarantee: when queried, it should return an estimate t of T that has relative error at most ε if T ≥ Γ, or an absolute error at most ε ⋅ Γ, otherwise. We prove that, under any ε ≤ 0.49 and subject to the OMv-conjecture, no structure can guarantee O(m^{0.5-δ}/Γ) expected amortized update time and O(m^{2/3-δ}) query time simultaneously for any constant δ > 0; this is true for Γ = m^c of any constant c in [0, 1/2). We match the lower bound with a structure that ensures Õ((1/ε)³ ⋅ √m/Γ) amortized update time with high probability, and O(1) query time.
- (for exact counting) how to achieve arboricity-sensitive update time. For any 1 ≤ Γ ≤ √m, we describe a structure of O(min{α m + m log m, (m/Γ)²}) space that maintains T precisely, and supports an update in Õ(min{α + Γ, √m}) amortized time, where α is the largest arboricity of G in history (and does not need to be known). Our structure reconstructs the aforementioned ICDT'19 result up to polylog factors by setting Γ = √m, but achieves Õ(m^{0.5-δ}) update time as long as α = O(m^{0.5-δ}).
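The (ε, Γ)-guarantee stated in the abstract can be written as a small predicate. This is a direct transcription of the definition given there, assuming a positive triangle count in the relative-error case; the function and parameter names are illustrative, and the sketch says nothing about how the paper's structure actually produces its estimate.

```python
def satisfies_guarantee(estimate, true_count, eps, gamma):
    """Check the (eps, gamma)-guarantee for an estimate of the triangle count.

    Relative error at most eps when true_count >= gamma,
    absolute error at most eps * gamma otherwise.
    """
    error = abs(estimate - true_count)
    if true_count >= gamma:
        return error <= eps * true_count
    return error <= eps * gamma


# Hypothetical numbers: T = 100 and Gamma = 50 fall under the relative-error case.
print(satisfies_guarantee(105, 100, eps=0.1, gamma=50))
# Hypothetical numbers: T = 5 and Gamma = 50 fall under the absolute-error case.
print(satisfies_guarantee(8, 5, eps=0.1, gamma=50))
```

Larger values of Γ tolerate larger absolute errors on sparse inputs, which is the slack the approximate structure trades for faster updates.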

Cite as

Shangqi Lu and Yufei Tao. Towards Optimal Dynamic Indexes for Approximate (and Exact) Triangle Counting. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 6:1-6:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{lu_et_al:LIPIcs.ICDT.2021.6,
  author =	{Lu, Shangqi and Tao, Yufei},
  title =	{{Towards Optimal Dynamic Indexes for Approximate (and Exact) Triangle Counting}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{6:1--6:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.6},
  URN =		{urn:nbn:de:0030-drops-137146},
  doi =		{10.4230/LIPIcs.ICDT.2021.6},
  annote =	{Keywords: Triangle Counting, Data Structures, Lower Bounds, Graph Algorithms}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2021.7

Grammars for Document Spanners

Authors: Liat Peterfreund

Published in: LIPIcs, Volume 186, 24th International Conference on Database Theory (ICDT 2021)

Abstract

We propose a new grammar-based language for defining information-extractors from documents (text) that is built upon the well-studied framework of document spanners for extracting structured data from text. While previously studied formalisms for document spanners are mainly based on regular expressions, we use an extension of context-free grammars, called extraction grammars, to define the new class of context-free spanners. Extraction grammars are simply context-free grammars extended with variables that capture interval positions of the document, namely spans. While regular expressions are efficient for tokenizing and tagging, context-free grammars are also efficient for capturing structural properties. Indeed, we show that context-free spanners are strictly more expressive than their regular counterparts. We reason about the expressive power of our new class and present a pushdown-automata model that captures it. We show that extraction grammars can be evaluated with polynomial data complexity. Nevertheless, as the degree of the polynomial depends on the query, we present an enumeration algorithm for unambiguous extraction grammars that, after quintic preprocessing, outputs the results sequentially, without repetitions, with a constant delay between every two consecutive ones.

Cite as

Liat Peterfreund. Grammars for Document Spanners. In 24th International Conference on Database Theory (ICDT 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 186, pp. 7:1-7:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{peterfreund:LIPIcs.ICDT.2021.7,
  author =	{Peterfreund, Liat},
  title =	{{Grammars for Document Spanners}},
  booktitle =	{24th International Conference on Database Theory (ICDT 2021)},
  pages =	{7:1--7:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-179-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{186},
  editor =	{Yi, Ke and Wei, Zhewei},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2021.7},
  URN =		{urn:nbn:de:0030-drops-137154},
  doi =		{10.4230/LIPIcs.ICDT.2021.7},
  annote =	{Keywords: Information Extraction, Document Spanners, Context-Free Grammars, Constant-Delay Enumeration, Regular Expressions, Pushdown Automata}
}

35 Search Results for "Yi, Ke"
