DROPS

Document

DOI: 10.4230/LIPIcs.ESA.2024.5

Online Sorting and Online TSP: Randomized, Stochastic, and High-Dimensional

Authors: Mikkel Abrahamsen, Ioana O. Bercea, Lorenzo Beretta, Jonas Klausen, and László Kozma

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)

Abstract

In the online sorting problem, n items are revealed one by one and have to be placed (immediately and irrevocably) into empty cells of a size-n array. The goal is to minimize the sum of absolute differences between items in consecutive cells. This natural problem was recently introduced by Aamand, Abrahamsen, Beretta, and Kleist (SODA 2023) as a tool in their study of online geometric packing problems. They showed that when the items are reals from the interval [0,1] a competitive ratio of O(√n) is achievable, and no deterministic algorithm can improve this ratio asymptotically. In this paper, we extend and generalize the study of online sorting in three directions: - randomized: we settle the open question of Aamand et al. by showing that the O(√n) competitive ratio for the online sorting of reals cannot be improved even with the use of randomness; - stochastic: we consider inputs consisting of n samples drawn uniformly at random from an interval, and give an algorithm with an improved competitive ratio of Õ(n^{1/4}). The result reveals connections between online sorting and the design of efficient hash tables; - high-dimensional: we show that Õ(√n)-competitive online sorting is possible even for items from ℝ^d, for arbitrary fixed d, in an adversarial model. This can be viewed as an online variant of the classical TSP problem where tasks (cities to visit) are revealed one by one and the salesperson assigns each task (immediately and irrevocably) to its timeslot. Along the way, we also show a tight O(log n)-competitiveness result for uniform metrics, i.e., where items are of different types and the goal is to order them so as to minimize the number of switches between consecutive items of different types.

Cite as

Mikkel Abrahamsen, Ioana O. Bercea, Lorenzo Beretta, Jonas Klausen, and László Kozma. Online Sorting and Online TSP: Randomized, Stochastic, and High-Dimensional. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 5:1-5:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{abrahamsen_et_al:LIPIcs.ESA.2024.5,
  author =	{Abrahamsen, Mikkel and Bercea, Ioana O. and Beretta, Lorenzo and Klausen, Jonas and Kozma, L\'{a}szl\'{o}},
  title =	{{Online Sorting and Online TSP: Randomized, Stochastic, and High-Dimensional}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{5:1--5:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.5},
  URN =		{urn:nbn:de:0030-drops-210766},
  doi =		{10.4230/LIPIcs.ESA.2024.5},
  annote =	{Keywords: sorting, online algorithm, TSP}
}

Document

DOI: 10.4230/LIPIcs.SWAT.2024.9

Daisy Bloom Filters

Authors: Ioana O. Bercea, Jakob Bæk Tejs Houen, and Rasmus Pagh

Published in: LIPIcs, Volume 294, 19th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2024)

Abstract

A filter is a widely used data structure for storing an approximation of a given set S of elements from some universe 𝒰 (a countable set). It represents a superset S' ⊇ S that is "close to S" in the sense that for x ∉ S, the probability that x ∈ S' is bounded by some ε > 0. The advantage of using a Bloom filter, when some false positives are acceptable, is that the space usage becomes smaller than what is required to store S exactly. Though filters are well-understood from a worst-case perspective, it is clear that state-of-the-art constructions may not be close to optimal for particular distributions of data and queries. Suppose, for instance, that some elements are in S with probability close to 1. Then it would make sense to always include them in S', saving space by not having to represent these elements in the filter. Questions like this have been raised in the context of Weighted Bloom filters (Bruck, Gao and Jiang, ISIT 2006) and Bloom filter implementations that make use of access to learned components (Vaidya, Knorr, Mitzenmacher, and Krask, ICLR 2021). In this paper, we present a lower bound for the expected space that such a filter requires. We also show that the lower bound is asymptotically tight by exhibiting a filter construction that executes queries and insertions in worst-case constant time, and has a false positive rate at most ε with high probability over input sets drawn from a product distribution. We also present a Bloom filter alternative, which we call the Daisy Bloom filter, that executes operations faster and uses significantly less space than the standard Bloom filter.

Cite as

Ioana O. Bercea, Jakob Bæk Tejs Houen, and Rasmus Pagh. Daisy Bloom Filters. In 19th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 294, pp. 9:1-9:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{bercea_et_al:LIPIcs.SWAT.2024.9,
  author =	{Bercea, Ioana O. and Houen, Jakob B{\ae}k Tejs and Pagh, Rasmus},
  title =	{{Daisy Bloom Filters}},
  booktitle =	{19th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2024)},
  pages =	{9:1--9:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-318-8},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{294},
  editor =	{Bodlaender, Hans L.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SWAT.2024.9},
  URN =		{urn:nbn:de:0030-drops-200491},
  doi =		{10.4230/LIPIcs.SWAT.2024.9},
  annote =	{Keywords: Bloom filters, input distribution, learned data structures}
}

Document

DOI: 10.4230/LIPIcs.SWAT.2020.11

A Dynamic Space-Efficient Filter with Constant Time Operations

Authors: Ioana O. Bercea and Guy Even

Published in: LIPIcs, Volume 162, 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020)

Abstract

A dynamic dictionary is a data structure that maintains sets of cardinality at most n from a given universe and supports insertions, deletions, and membership queries. A filter approximates membership queries with a one-sided error that occurs with probability at most ε. The goal is to obtain dynamic filters that are space-efficient (the space is 1+o(1) times the information-theoretic lower bound) and support all operations in constant time with high probability. One approach to designing filters is to reduce to the retrieval problem. When the size of the universe is polynomial in n, this approach yields a space-efficient dynamic filter as long as the error parameter ε satisfies log(1/ε) = ω(log log n). For the case that log(1/ε) = O(log log n), we present the first space-efficient dynamic filter with constant time operations in the worst case (whp). In contrast, the space-efficient dynamic filter of Pagh et al. [Anna Pagh et al., 2005] supports insertions and deletions in amortized expected constant time. Our approach employs the classic reduction of Carter et al. [Carter et al., 1978] on a new type of dictionary construction that supports random multisets.

Cite as

Ioana O. Bercea and Guy Even. A Dynamic Space-Efficient Filter with Constant Time Operations. In 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 162, pp. 11:1-11:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{bercea_et_al:LIPIcs.SWAT.2020.11,
  author =	{Bercea, Ioana O. and Even, Guy},
  title =	{{A Dynamic Space-Efficient Filter with Constant Time Operations}},
  booktitle =	{17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020)},
  pages =	{11:1--11:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-150-4},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{162},
  editor =	{Albers, Susanne},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SWAT.2020.11},
  URN =		{urn:nbn:de:0030-drops-122582},
  doi =		{10.4230/LIPIcs.SWAT.2020.11},
  annote =	{Keywords: Data Structures}
}

Document

APPROX

DOI: 10.4230/LIPIcs.APPROX-RANDOM.2019.18

On the Cost of Essentially Fair Clusterings

Authors: Ioana O. Bercea, Martin Groß, Samir Khuller, Aounon Kumar, Clemens Rösner, Daniel R. Schmidt, and Melanie Schmidt

Published in: LIPIcs, Volume 145, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)

Abstract

Clustering is a fundamental tool in data mining and machine learning. It partitions points into groups (clusters) and may be used to make decisions for each point based on its group. However, this process may harm protected (minority) classes if the clustering algorithm does not adequately represent them in desirable clusters - especially if the data is already biased. At NIPS 2017, Chierichetti et al. [Flavio Chierichetti et al., 2017] proposed a model for fair clustering requiring the representation in each cluster to (approximately) preserve the global fraction of each protected class. Restricting to two protected classes, they developed both a 4-approximation for the fair k-center problem and a O(t)-approximation for the fair k-median problem, where t is a parameter for the fairness model. For multiple protected classes, the best known result is a 14-approximation for fair k-center [Clemens Rösner and Melanie Schmidt, 2018]. We extend and improve the known results. Firstly, we give a 5-approximation for the fair k-center problem with multiple protected classes. Secondly, we propose a relaxed fairness notion under which we can give bicriteria constant-factor approximations for all of the classical clustering objectives k-center, k-supplier, k-median, k-means and facility location. The latter approximations are achieved by a framework that takes an arbitrary existing unfair (integral) solution and a fair (fractional) LP solution and combines them into an essentially fair clustering with a weakly supervised rounding scheme. In this way, a fair clustering can be established belatedly, in a situation where the centers are already fixed.

Cite as

Ioana O. Bercea, Martin Groß, Samir Khuller, Aounon Kumar, Clemens Rösner, Daniel R. Schmidt, and Melanie Schmidt. On the Cost of Essentially Fair Clusterings. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 145, pp. 18:1-18:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{bercea_et_al:LIPIcs.APPROX-RANDOM.2019.18,
  author =	{Bercea, Ioana O. and Gro{\ss}, Martin and Khuller, Samir and Kumar, Aounon and R\"{o}sner, Clemens and Schmidt, Daniel R. and Schmidt, Melanie},
  title =	{{On the Cost of Essentially Fair Clusterings}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)},
  pages =	{18:1--18:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-125-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{145},
  editor =	{Achlioptas, Dimitris and V\'{e}gh, L\'{a}szl\'{o} A.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2019.18},
  URN =		{urn:nbn:de:0030-drops-112337},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2019.18},
  annote =	{Keywords: approximation, clustering, fairness, LP rounding}
}

@InProceedings{bercea_et_al:LIPIcs.APPROX-RANDOM.2019.18,
  author =	{Bercea, Ioana O. and Gro{\ss}, Martin and Khuller, Samir and Kumar, Aounon and R\"{o}sner, Clemens and Schmidt, Daniel R. and Schmidt, Melanie},
  title =	{{On the Cost of Essentially Fair Clusterings}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)},
  pages =	{18:1--18:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-125-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{145},
  editor =	{Achlioptas, Dimitris and V\'{e}gh, L\'{a}szl\'{o} A.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2019.18},
  URN =		{urn:nbn:de:0030-drops-112337},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2019.18},
  annote =	{Keywords: approximation, clustering, fairness, LP rounding}
}

Search Results

Documents authored by Bercea, Ioana O.

Online Sorting and Online TSP: Randomized, Stochastic, and High-Dimensional

Abstract

Cite as

Daisy Bloom Filters

Abstract

Cite as

A Dynamic Space-Efficient Filter with Constant Time Operations

Abstract

Cite as

On the Cost of Essentially Fair Clusterings

Abstract

Cite as

Thanks for your feedback!

Could not send message