DROPS

Document

DOI: 10.4230/LIPIcs.ESA.2021.60

Telescoping Filter: A Practical Adaptive Filter

Authors: David J. Lee, Samuel McCauley, Shikha Singh, and Max Stein

Published in: LIPIcs, Volume 204, 29th Annual European Symposium on Algorithms (ESA 2021)

Abstract

Filters are small, fast, and approximate set membership data structures. They are often used to filter out expensive accesses to a remote set S for negative queries (that is, filtering out queries x ∉ S). Filters have one-sided errors: on a negative query, a filter may say "present" with a tunable false-positive probability of ε. Correctness is traded for space: filters only use log (1/ε) + O(1) bits per element. The false-positive guarantees of most filters, however, hold only for a single query. In particular, if x is a false positive, a subsequent query to x is a false positive with probability 1, not ε. With this in mind, recent work has introduced the notion of an adaptive filter. A filter is adaptive if each query is a false positive with probability ε, regardless of answers to previous queries. This requires "fixing" false positives as they occur. Adaptive filters not only provide strong false positive guarantees in adversarial environments but also improve query performance on practical workloads by eliminating repeated false positives. Existing work on adaptive filters falls into two categories. On the one hand, there are practical filters, based on the cuckoo filter, that attempt to fix false positives heuristically without meeting the adaptivity guarantee. On the other hand, the broom filter is a very complex adaptive filter that meets the optimal theoretical bounds. In this paper, we bridge this gap by designing the telescoping adaptive filter (TAF), a practical, provably adaptive filter. We provide theoretical false-positive and space guarantees for our filter, along with empirical results where we compare its performance against state-of-the-art filters. We also implement the broom filter and compare it to the TAF. Our experiments show that theoretical adaptivity can lead to improved false-positive performance on practical inputs, and can be achieved while maintaining throughput that is similar to non-adaptive filters.
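The repeated-false-positive behavior described in the abstract can be illustrated with a minimal Python sketch. It is a toy, not the TAF or any filter from the paper: the class name `ToyFingerprintFilter`, the 4-bit fingerprints, and the use of SHA-256 are all assumptions chosen for illustration. Because the filter never changes its state on a query, a negative query that collides once collides on every later query.

```python
import hashlib


class ToyFingerprintFilter:
    """Hypothetical non-adaptive filter that stores only short fingerprints."""

    def __init__(self, items, bits=4):
        self.bits = bits
        self.fingerprints = {self._fingerprint(x) for x in items}

    def _fingerprint(self, item):
        # Deterministic hash truncated to `bits` bits (illustrative choice).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % (1 << self.bits)

    def query(self, item):
        # "Present" whenever the fingerprint matches a stored one.
        return self._fingerprint(item) in self.fingerprints


def first_false_positive(filt, members, limit=10_000):
    """Search for a non-member that the filter reports as present."""
    for i in range(limit):
        candidate = f"query-{i}"
        if candidate not in members and filt.query(candidate):
            return candidate
    return None


members = {"apple", "banana", "cherry"}
filt = ToyFingerprintFilter(members, bits=4)
fp = first_false_positive(filt, members)
# Re-querying the same false positive: a non-adaptive filter never fixes it.
repeat_answers = [filt.query(fp) for _ in range(5)] if fp is not None else []
```

An adaptive filter would instead update its internal state after learning that `fp` is a false positive, so that later queries to `fp` err only with probability ε.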

Cite as

David J. Lee, Samuel McCauley, Shikha Singh, and Max Stein. Telescoping Filter: A Practical Adaptive Filter. In 29th Annual European Symposium on Algorithms (ESA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 204, pp. 60:1-60:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{lee_et_al:LIPIcs.ESA.2021.60,
  author =	{Lee, David J. and McCauley, Samuel and Singh, Shikha and Stein, Max},
  title =	{{Telescoping Filter: A Practical Adaptive Filter}},
  booktitle =	{29th Annual European Symposium on Algorithms (ESA 2021)},
  pages =	{60:1--60:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-204-4},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{204},
  editor =	{Mutzel, Petra and Pagh, Rasmus and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2021.60},
  URN =		{urn:nbn:de:0030-drops-146410},
  doi =		{10.4230/LIPIcs.ESA.2021.60},
  annote =	{Keywords: Filters, approximate-membership query data structures (AMQs), Bloom filters, quotient filters, cuckoo filters, adaptivity, succinct data structures}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2021.6

Simple Heuristics Yield Provable Algorithms for Masked Low-Rank Approximation

Authors: Cameron Musco, Christopher Musco, and David P. Woodruff

Published in: LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)

Abstract

In the masked low-rank approximation problem, one is given data matrix A ∈ ℝ^{n × n} and binary mask matrix W ∈ {0,1}^{n × n}. The goal is to find a rank-k matrix L for which: cost(L) := ∑_{i=1}^n ∑_{j=1}^n W_{i,j} ⋅ (A_{i,j} - L_{i,j})² ≤ OPT + ε ‖A‖_F², where OPT = min_{rank-k L̂} cost(L̂) and ε is a given error parameter. Depending on the choice of W, the above problem captures factor analysis, low-rank plus diagonal decomposition, robust PCA, low-rank matrix completion, low-rank plus block matrix approximation, low-rank recovery from monotone missing data, and a number of other important problems. Many of these problems are NP-hard, and while algorithms with provable guarantees are known in some cases, they either 1) run in time n^{Ω(k²/ε)} or 2) make strong assumptions, for example, that A is incoherent or that the entries in W are chosen independently and uniformly at random. In this work, we show that a common polynomial time heuristic, which simply sets A to 0 where W is 0, and then finds a standard low-rank approximation, yields bicriteria approximation guarantees for this problem. In particular, for rank k' > k depending on the public coin partition number of W, the heuristic outputs rank-k' L with cost(L) ≤ OPT + ε ‖A‖_F². This partition number is in turn bounded by the randomized communication complexity of W, when interpreted as a two-player communication matrix. For many important cases, including all those listed above, this yields bicriteria approximation guarantees with rank k' = k ⋅ poly(log n/ε). Beyond this result, we show that different notions of communication complexity yield bicriteria algorithms for natural variants of masked low-rank approximation. For example, multi-player number-in-hand communication complexity connects to masked tensor decomposition and non-deterministic communication complexity to masked Boolean low-rank factorization.
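The cost function and the zero-filling heuristic from the abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the low-rank step is restricted to rank 1, and it uses plain power iteration (started from a uniform vector, which can fail in degenerate cases) in place of a full SVD.

```python
import math


def zero_fill(A, W):
    """Set A to 0 wherever the mask W is 0."""
    return [[a if w else 0.0 for a, w in zip(ra, rw)] for ra, rw in zip(A, W)]


def masked_cost(A, W, L):
    """cost(L) = sum_{i,j} W[i][j] * (A[i][j] - L[i][j])^2."""
    return sum(
        w * (a - l) ** 2
        for ra, rw, rl in zip(A, W, L)
        for a, w, l in zip(ra, rw, rl)
    )


def rank_one_approx(B, iterations=200):
    """Approximate the best rank-1 fit of B by power iteration on B^T B."""
    n_cols = len(B[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iterations):
        Bv = [sum(b * x for b, x in zip(row, v)) for row in B]
        w = [sum(B[i][j] * Bv[i] for i in range(len(B))) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    Bv = [sum(b * x for b, x in zip(row, v)) for row in B]
    # L = (B v) v^T, a rank-1 matrix.
    return [[bv * x for x in v] for bv in Bv]


A = [[1.0, 1.0], [1.0, 5.0]]
W = [[1, 1], [1, 0]]            # the entry A[1][1] is masked out
B = zero_fill(A, W)             # heuristic step 1: zero the masked entries
L = rank_one_approx(B)          # heuristic step 2: ordinary low-rank fit
cost = masked_cost(A, W, L)
frobenius_sq = sum(a * a for row in A for a in row)
```

In this small example the all-ones matrix has rank 1 and masked cost 0, so OPT = 0, while the heuristic's rank-1 output has a small positive cost. The paper's guarantee concerns a larger rank k' > k, which this rank-1 sketch does not attempt to reproduce.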

Cite as

Cameron Musco, Christopher Musco, and David P. Woodruff. Simple Heuristics Yield Provable Algorithms for Masked Low-Rank Approximation. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 6:1-6:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{musco_et_al:LIPIcs.ITCS.2021.6,
  author =	{Musco, Cameron and Musco, Christopher and Woodruff, David P.},
  title =	{{Simple Heuristics Yield Provable Algorithms for Masked Low-Rank Approximation}},
  booktitle =	{12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages =	{6:1--6:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-177-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{185},
  editor =	{Lee, James R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.6},
  URN =		{urn:nbn:de:0030-drops-135452},
  doi =		{10.4230/LIPIcs.ITCS.2021.6},
  annote =	{Keywords: low-rank approximation, communication complexity, weighted low-rank approximation, bicriteria approximation algorithms}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2021.80

Erasure-Resilient Sublinear-Time Graph Algorithms

Authors: Amit Levi, Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova, and Nithin Varma

Published in: LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)

Abstract

We investigate sublinear-time algorithms that take partially erased graphs represented by adjacency lists as input. Our algorithms make degree and neighbor queries to the input graph and work with a specified fraction of adversarial erasures in adjacency entries. We focus on two computational tasks: testing if a graph is connected or ε-far from connected and estimating the average degree. For testing connectedness, we discover a threshold phenomenon: when the fraction of erasures is less than ε, this property can be tested efficiently (in time independent of the size of the graph); when the fraction of erasures is at least ε, then a number of queries linear in the size of the graph representation is required. Our erasure-resilient algorithm (for the special case with no erasures) is an improvement over the previously known algorithm for connectedness in the standard property testing model and has optimal dependence on the proximity parameter ε. For estimating the average degree, our results provide an "interpolation" between the query complexity for this computational task in the model with no erasures in two different settings: with only degree queries, investigated by Feige (SIAM J. Comput. '06), and with degree queries and neighbor queries, investigated by Goldreich and Ron (Random Struct. Algorithms '08) and Eden et al. (ICALP '17). We conclude with a discussion of our model and open questions raised by our work.

Cite as

Amit Levi, Ramesh Krishnan S. Pallavoor, Sofya Raskhodnikova, and Nithin Varma. Erasure-Resilient Sublinear-Time Graph Algorithms. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 80:1-80:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{levi_et_al:LIPIcs.ITCS.2021.80,
  author =	{Levi, Amit and Pallavoor, Ramesh Krishnan S. and Raskhodnikova, Sofya and Varma, Nithin},
  title =	{{Erasure-Resilient Sublinear-Time Graph Algorithms}},
  booktitle =	{12th Innovations in Theoretical Computer Science Conference (ITCS 2021)},
  pages =	{80:1--80:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-177-1},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{185},
  editor =	{Lee, James R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.80},
  URN =		{urn:nbn:de:0030-drops-136192},
  doi =		{10.4230/LIPIcs.ITCS.2021.80},
  annote =	{Keywords: Graph property testing, Computing with incomplete information, Approximating graph parameters}
}

Document

DOI: 10.4230/LIPIcs.DNA.2020.9

scadnano: A Browser-Based, Scriptable Tool for Designing DNA Nanostructures

Authors: David Doty, Benjamin L Lee, and Tristan Stérin

Published in: LIPIcs, Volume 174, 26th International Conference on DNA Computing and Molecular Programming (DNA 26) (2020)

Abstract

We introduce scadnano (short for "scriptable cadnano"), a computational tool for designing synthetic DNA structures. Its design is based heavily on cadnano [Douglas et al., 2009], the most widely-used software for designing DNA origami [Paul W. K. Rothemund, 2006], with three main differences: 1) scadnano runs entirely in the browser, with no software installation required. 2) scadnano designs, while they can be edited manually, can also be created and edited by a well-documented Python scripting library, to help automate tedious tasks. 3) The scadnano file format is easily human-readable. This goal is closely aligned with the scripting library, intended to be helpful when debugging scripts or interfacing with other software. The format is also somewhat more expressive than that of cadnano, able to describe a broader range of DNA structures than just DNA origami.

Cite as

David Doty, Benjamin L Lee, and Tristan Stérin. scadnano: A Browser-Based, Scriptable Tool for Designing DNA Nanostructures. In 26th International Conference on DNA Computing and Molecular Programming (DNA 26). Leibniz International Proceedings in Informatics (LIPIcs), Volume 174, pp. 9:1-9:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{doty_et_al:LIPIcs.DNA.2020.9,
  author =	{Doty, David and Lee, Benjamin L and St\'{e}rin, Tristan},
  title =	{{scadnano: A Browser-Based, Scriptable Tool for Designing DNA Nanostructures}},
  booktitle =	{26th International Conference on DNA Computing and Molecular Programming (DNA 26)},
  pages =	{9:1--9:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-163-4},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{174},
  editor =	{Geary, Cody and Patitz, Matthew J.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.DNA.2020.9},
  URN =		{urn:nbn:de:0030-drops-129624},
  doi =		{10.4230/LIPIcs.DNA.2020.9},
  annote =	{Keywords: computer-aided design, structural DNA nanotechnology, DNA origami}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2018.34

On Polynomial Time Constructions of Minimum Height Decision Tree

Authors: Nader H. Bshouty and Waseem Makhoul

Published in: LIPIcs, Volume 123, 29th International Symposium on Algorithms and Computation (ISAAC 2018)

Abstract

A decision tree T in B_m := {0,1}^m is a binary tree in which each internal node is labeled with an integer in [m] = {1,2,...,m}, each leaf is labeled with an assignment a ∈ B_m, and each internal node has two outgoing edges that are labeled with 0 and 1, respectively. Let A ⊂ {0,1}^m. We say that T is a decision tree for A if (1) for every a ∈ A there is one leaf of T that is labeled with a, and (2) for every path from the root to a leaf with internal nodes labeled with i_1,i_2,...,i_k ∈ [m], a leaf labeled with a ∈ A, and edges labeled with ξ_{i_1},...,ξ_{i_k} ∈ {0,1}, a is the only element in A that satisfies a_{i_j} = ξ_{i_j} for all j = 1,...,k. Our goal is to write a polynomial time (in n := |A| and m) algorithm that for an input A ⊆ B_m outputs a decision tree for A of minimum depth. This problem has many applications that include, to name a few, computer vision, group testing, exact learning from membership queries and game theory. Arkin et al. and Moshkov [Esther M. Arkin et al., 1998; Mikhail Ju. Moshkov, 2004] gave a polynomial time (ln |A|)-approximation algorithm (for the depth). The result of Dinur and Steurer [Irit Dinur and David Steurer, 2014] for set cover implies that this problem cannot be approximated with ratio (1-o(1)) ⋅ ln |A|, unless P=NP. Moshkov studied in [Mikhail Ju. Moshkov, 2004; Mikhail Ju. Moshkov, 1982; Mikhail Ju. Moshkov, 1982] the combinatorial measure of extended teaching dimension of A, ETD(A). He showed that ETD(A) is a lower bound for the depth of the decision tree for A and then gave an exponential time ETD(A)/log(ETD(A))-approximation algorithm and a polynomial time 2(ln 2)ETD(A)-approximation algorithm. In this paper we further study the ETD(A) measure and a new combinatorial measure, DEN(A), that we call the density of the set A. We show that DEN(A) ≤ ETD(A)+1. We then give two results. The first result is that the lower bound ETD(A) of Moshkov for the depth of the decision tree for A is greater than the bounds that are obtained by the classical technique used in the literature. The second result is a polynomial time (ln 2)DEN(A)-approximation (and therefore (ln 2)ETD(A)-approximation) algorithm for the depth of the decision tree of A. We then apply the above results to learning the class of disjunctions of predicates from membership queries [Nader H. Bshouty et al., 2017]. We show that the ETD of this class is bounded from above by the degree d of its Hasse diagram. We then show that Moshkov's algorithm can be run in polynomial time and is a (d/log d)-approximation algorithm. This gives optimal algorithms when the degree is constant, for example, for learning axis-parallel rays over a constant-dimension space.
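The definition of a decision tree for a set A can be made concrete with a short Python sketch. It uses a simple greedy rule (split on the coordinate that best balances the two sides), which is an assumption made for illustration and is not the algorithm of this paper or of Moshkov. The function names and the dict-based tree representation are hypothetical, and coordinates are 0-based here rather than in [m] = {1,...,m}.

```python
def build_tree(A):
    """Greedy decision tree for a non-empty set A of equal-length 0/1 tuples.

    Internal nodes are dicts {"index": i, 0: subtree, 1: subtree};
    leaves are the assignments themselves.
    """
    A = sorted(set(A))
    if len(A) == 1:
        return A[0]
    best = None
    for i in range(len(A[0])):
        zeros = [a for a in A if a[i] == 0]
        ones = [a for a in A if a[i] == 1]
        if zeros and ones:  # coordinate i actually separates A
            score = max(len(zeros), len(ones))
            if best is None or score < best[0]:
                best = (score, i, zeros, ones)
    # Distinct assignments always differ in some coordinate, so best is set.
    _, i, zeros, ones = best
    return {"index": i, 0: build_tree(zeros), 1: build_tree(ones)}


def depth(tree):
    """Number of internal nodes on the longest root-to-leaf path."""
    if isinstance(tree, dict):
        return 1 + max(depth(tree[0]), depth(tree[1]))
    return 0


def leaves(tree):
    """All assignments that label the leaves of the tree."""
    if isinstance(tree, dict):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]


A = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
tree = build_tree(A)
tree_depth = depth(tree)
tree_leaves = sorted(leaves(tree))
```

Each element of A ends up at exactly one leaf, and the answers along the path to that leaf are consistent with no other element of A, which matches conditions (1) and (2) of the definition. The greedy rule gives no optimality guarantee on the depth.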

Cite as

Nader H. Bshouty and Waseem Makhoul. On Polynomial Time Constructions of Minimum Height Decision Tree. In 29th International Symposium on Algorithms and Computation (ISAAC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 123, pp. 34:1-34:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{h.bshouty_et_al:LIPIcs.ISAAC.2018.34,
  author =	{Bshouty, Nader H. and Makhoul, Waseem},
  title =	{{On Polynomial Time Constructions of Minimum Height Decision Tree}},
  booktitle =	{29th International Symposium on Algorithms and Computation (ISAAC 2018)},
  pages =	{34:1--34:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-094-1},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{123},
  editor =	{Hsu, Wen-Lian and Lee, Der-Tsai and Liao, Chung-Shou},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2018.34},
  URN =		{urn:nbn:de:0030-drops-99824},
  doi =		{10.4230/LIPIcs.ISAAC.2018.34},
  annote =	{Keywords: Decision Tree, Minimal Depth, Approximation algorithms}
}

5 Search Results for "Lee, David J."
