Document

APPROX

**Published in:** LIPIcs, Volume 275, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)

In this work we consider the diversity maximization problem, where given a data set X of n elements, and a parameter k, the goal is to pick a subset of X of size k maximizing a certain diversity measure. Chandra and Halldórsson [Barun Chandra and Magnús M. Halldórsson, 2001] defined a variety of diversity measures based on pairwise distances between the points. A constant factor approximation algorithm was known for all those diversity measures except "remote-matching", where only an O(log k) approximation was known. In this work we present an O(1) approximation for this remaining notion. Further, we consider these notions from the perpective of composable coresets. Indyk et al. [Piotr Indyk et al., 2014] provided composable coresets with a constant factor approximation for all but "remote-pseudoforest" and "remote-matching", which again they only obtained a O(log k) approximation. Here we also close the gap up to constants and present a constant factor composable coreset algorithm for these two notions. For remote-matching, our coreset has size only O(k), and for remote-pseudoforest, our coreset has size O(k^{1+ε}) for any ε > 0, for an O(1/ε)-approximate coreset.

Sepideh Mahabadi and Shyam Narayanan. Improved Diversity Maximization Algorithms for Matching and Pseudoforest. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 25:1-25:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{mahabadi_et_al:LIPIcs.APPROX/RANDOM.2023.25, author = {Mahabadi, Sepideh and Narayanan, Shyam}, title = {{Improved Diversity Maximization Algorithms for Matching and Pseudoforest}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)}, pages = {25:1--25:22}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-296-9}, ISSN = {1868-8969}, year = {2023}, volume = {275}, editor = {Megow, Nicole and Smith, Adam}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2023.25}, URN = {urn:nbn:de:0030-drops-188503}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2023.25}, annote = {Keywords: diversity maximization, approximation algorithms, composable coresets} }

Document

RANDOM

**Published in:** LIPIcs, Volume 275, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)

In classical statistics and distribution testing, it is often assumed that elements can be sampled exactly from some distribution 𝒫, and that when an element x is sampled, the probability 𝒫(x) of sampling x is also known. In this setting, recent work in distribution testing has shown that many algorithms are robust in the sense that they still produce correct output if the elements are drawn from any distribution 𝒬 that is sufficiently close to 𝒫. This phenomenon raises interesting questions: under what conditions is a "noisy" distribution 𝒬 sufficient, and what is the algorithmic cost of coping with this noise?
In this paper, we investigate these questions for the problem of estimating the sum of a multiset of N real values x_1, …, x_N. This problem is well-studied in the statistical literature in the case 𝒫 = 𝒬, where the Hansen-Hurwitz estimator [Annals of Mathematical Statistics, 1943] is frequently used. We assume that for some (known) distribution 𝒫, values are sampled from a distribution 𝒬 that is pointwise close to 𝒫. That is, there is a parameter γ < 1 such that for all x_i, (1 - γ) 𝒫(i) ≤ 𝒬(i) ≤ (1 + γ) 𝒫(i). For every positive integer k we define an estimator ζ_k for μ = ∑_i x_i whose bias is proportional to γ^k (where our ζ₁ reduces to the classical Hansen-Hurwitz estimator). As a special case, we show that if 𝒬 is pointwise γ-close to uniform and all x_i ∈ {0, 1}, for any ε > 0, we can estimate μ to within additive error ε N using m = Θ(N^{1-1/k}/ε^{2/k}) samples, where k = ⌈lg ε/lg γ⌉. We then show that this sample complexity is essentially optimal. Interestingly, our upper and lower bounds show that the sample complexity need not vary uniformly with the desired error parameter ε: for some values of ε, perturbations in its value have no asymptotic effect on the sample complexity, while for other values, any decrease in its value results in an asymptotically larger sample complexity.

Talya Eden, Jakob Bæk Tejs Houen, Shyam Narayanan, Will Rosenbaum, and Jakub Tětek. Bias Reduction for Sum Estimation. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 62:1-62:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{eden_et_al:LIPIcs.APPROX/RANDOM.2023.62, author = {Eden, Talya and Houen, Jakob B{\ae}k Tejs and Narayanan, Shyam and Rosenbaum, Will and T\v{e}tek, Jakub}, title = {{Bias Reduction for Sum Estimation}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023)}, pages = {62:1--62:21}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-296-9}, ISSN = {1868-8969}, year = {2023}, volume = {275}, editor = {Megow, Nicole and Smith, Adam}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2023.62}, URN = {urn:nbn:de:0030-drops-188872}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2023.62}, annote = {Keywords: bias reduction, sum estimation, sublinear time algorithms, sample complexity} }

Document

Track A: Algorithms, Complexity and Games

**Published in:** LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)

The p-processor cup game is a classic and widely studied scheduling problem that captures the setting in which a p-processor machine must assign tasks to processors over time in order to ensure that no individual task ever falls too far behind. The problem is formalized as a multi-round game in which two players, a filler (who assigns work to tasks) and an emptier (who schedules tasks) compete. The emptier’s goal is to minimize backlog, which is the maximum amount of outstanding work for any task.
Recently, Kuszmaul and Westover (ITCS, 2021) proposed the variable-processor cup game, which considers the same problem, except that the amount of resources available to the players (i.e., the number p of processors) fluctuates between rounds of the game. They showed that this seemingly small modification fundamentally changes the dynamics of the game: whereas the optimal backlog in the fixed p-processor game is Θ(log n), independent of p, the optimal backlog in the variable-processor game is Θ(n). The latter result was only known to apply to games with exponentially many rounds, however, and it has remained an open question what the optimal tradeoff between time and backlog is for shorter games.
This paper establishes a tight trade-off curve between time and backlog in the variable-processor cup game. We show that, for a game consisting of t rounds, the optimal backlog is Θ (b (t)) where
b(t) =
t (if t ≤ log n)
t^{1/3} log^{2/3} ({n^3}/t + 1) (if log n < t ≤ n^3)
n (if n ^ 3 < t).
An important consequence is that the optimal backlog is Θ(n) if and only if t ≥ Ω(n³). Our techniques also allow for us to resolve several other open questions concerning how the variable-processor cup game behaves in beyond-worst-case-analysis settings.

William Kuszmaul and Shyam Narayanan. Optimal Time-Backlog Tradeoffs for the Variable-Processor Cup Game. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 85:1-85:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

Copy BibTex To Clipboard

@InProceedings{kuszmaul_et_al:LIPIcs.ICALP.2022.85, author = {Kuszmaul, William and Narayanan, Shyam}, title = {{Optimal Time-Backlog Tradeoffs for the Variable-Processor Cup Game}}, booktitle = {49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)}, pages = {85:1--85:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-235-8}, ISSN = {1868-8969}, year = {2022}, volume = {229}, editor = {Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.85}, URN = {urn:nbn:de:0030-drops-164263}, doi = {10.4230/LIPIcs.ICALP.2022.85}, annote = {Keywords: Cup Games, Potential Functions, Greedy} }

Document

**Published in:** LIPIcs, Volume 185, 12th Innovations in Theoretical Computer Science Conference (ITCS 2021)

Trace reconstruction is the problem of learning an unknown string x from independent traces of x, where traces are generated by independently deleting each bit of x with some deletion probability q. In this paper, we initiate the study of Circular trace reconstruction, where the unknown string x is circular and traces are now rotated by a random cyclic shift. Trace reconstruction is related to many computational biology problems studying DNA, which is a primary motivation for this problem as well, as many types of DNA are known to be circular.
Our main results are as follows. First, we prove that we can reconstruct arbitrary circular strings of length n using exp(Õ(n^{1/3})) traces for any constant deletion probability q, as long as n is prime or the product of two primes. For n of this form, this nearly matches what was the best known bound of exp(O(n^{1/3})) for standard trace reconstruction when this paper was initially released. We note, however, that Chase very recently improved the standard trace reconstruction bound to exp(Õ(n^{1/5})). Next, we prove that we can reconstruct random circular strings with high probability using n^O(1) traces for any constant deletion probability q. Finally, we prove a lower bound of Ω̃(n³) traces for arbitrary circular strings, which is greater than the best known lower bound of Ω̃(n^{3/2}) in standard trace reconstruction.

Shyam Narayanan and Michael Ren. Circular Trace Reconstruction. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 185, pp. 18:1-18:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{narayanan_et_al:LIPIcs.ITCS.2021.18, author = {Narayanan, Shyam and Ren, Michael}, title = {{Circular Trace Reconstruction}}, booktitle = {12th Innovations in Theoretical Computer Science Conference (ITCS 2021)}, pages = {18:1--18:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-177-1}, ISSN = {1868-8969}, year = {2021}, volume = {185}, editor = {Lee, James R.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2021.18}, URN = {urn:nbn:de:0030-drops-135573}, doi = {10.4230/LIPIcs.ITCS.2021.18}, annote = {Keywords: Trace Reconstruction, Deletion Channel, Cyclotomic Integers} }

Document

RANDOM

**Published in:** LIPIcs, Volume 145, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)

A family of problems that have been studied in the context of various streaming algorithms are generalizations of the fact that the expected maximum distance of a 4-wise independent random walk on a line over n steps is O(sqrt{n}). For small values of k, there exist k-wise independent random walks that can be stored in much less space than storing n random bits, so these properties are often useful for lowering space bounds. In this paper, we show that for all of these examples, 4-wise independence is required by demonstrating a pairwise independent random walk with steps uniform in +/- 1 and expected maximum distance Omega(sqrt{n} lg n) from the origin. We also show that this bound is tight for the first and second moment, i.e. the expected maximum square distance of a 2-wise independent random walk is always O(n lg^2 n). Also, for any even k >= 4, we show that the kth moment of the maximum distance of any k-wise independent random walk is O(n^{k/2}). The previous two results generalize to random walks tracking insertion-only streams, and provide higher moment bounds than currently known. We also prove a generalization of Kolmogorov’s maximal inequality by showing an asymptotically equivalent statement that requires only 4-wise independent random variables with bounded second moments, which also generalizes a result of Błasiok.

Shyam Narayanan. Pairwise Independent Random Walks Can Be Slightly Unbounded. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 145, pp. 63:1-63:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{narayanan:LIPIcs.APPROX-RANDOM.2019.63, author = {Narayanan, Shyam}, title = {{Pairwise Independent Random Walks Can Be Slightly Unbounded}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)}, pages = {63:1--63:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-125-2}, ISSN = {1868-8969}, year = {2019}, volume = {145}, editor = {Achlioptas, Dimitris and V\'{e}gh, L\'{a}szl\'{o} A.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2019.63}, URN = {urn:nbn:de:0030-drops-112787}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2019.63}, annote = {Keywords: k-wise Independence, Random Walks, Moments, Chaining} }

Document

**Published in:** LIPIcs, Volume 116, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)

The 1-center clustering with outliers problem asks about identifying a prototypical robust statistic that approximates the location of a cluster of points. Given some constant 0 < alpha < 1 and n points such that alpha n of them are in some (unknown) ball of radius r, the goal is to compute a ball of radius O(r) that also contains alpha n points. This problem can be formulated with the points in a normed vector space such as R^d or in a general metric space.
The problem has a simple randomized solution: a randomly selected point is a correct solution with constant probability, and its correctness can be verified in linear time. However, the deterministic complexity of this problem was not known. In this paper, for any L^p vector space, we show an O(nd)-time solution with a ball of radius O(r) for a fixed alpha > 1/2, and for any normed vector space, we show an O(nd)-time solution with a ball of radius O(r) when alpha > 1/2 as well as an O(nd log^{(k)}(n))-time solution with a ball of radius O(r) for all alpha > 0, k in N, where log^{(k)}(n) represents the kth iterated logarithm, assuming distance computation and vector space operations take O(d) time. For an arbitrary metric space, we show for any C in N an O(n^{1+1/C})-time solution that finds a ball of radius 2Cr, assuming distance computation between any pair of points takes O(1)-time, and show that for any alpha, C, an O(n^{1+1/C})-time solution that finds a ball of radius ((2C-3)(1-alpha)-1)r cannot exist.

Shyam Narayanan. Deterministic O(1)-Approximation Algorithms to 1-Center Clustering with Outliers. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 21:1-21:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{narayanan:LIPIcs.APPROX-RANDOM.2018.21, author = {Narayanan, Shyam}, title = {{Deterministic O(1)-Approximation Algorithms to 1-Center Clustering with Outliers}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)}, pages = {21:1--21:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-085-9}, ISSN = {1868-8969}, year = {2018}, volume = {116}, editor = {Blais, Eric and Jansen, Klaus and D. P. Rolim, Jos\'{e} and Steurer, David}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2018.21}, URN = {urn:nbn:de:0030-drops-94253}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2018.21}, annote = {Keywords: Deterministic, Approximation Algorithm, Cluster, Statistic} }