eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
1
1064
10.4230/LIPIcs.APPROX/RANDOM.2022
article
LIPIcs, Volume 245, APPROX/RANDOM 2022, Complete Volume
Chakrabarti, Amit
1
https://orcid.org/0000-0003-3633-9180
Swamy, Chaitanya
2
https://orcid.org/0000-0003-1108-7941
Dartmouth College, Hanover, NH, USA
University of Waterloo, Canada
LIPIcs, Volume 245, APPROX/RANDOM 2022, Complete Volume
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022/LIPIcs.APPROX-RANDOM.2022.pdf
LIPIcs, Volume 245, APPROX/RANDOM 2022, Complete Volume
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
0:i
0:xx
10.4230/LIPIcs.APPROX/RANDOM.2022.0
article
Front Matter, Table of Contents, Preface, Conference Organization
Chakrabarti, Amit
1
https://orcid.org/0000-0003-3633-9180
Swamy, Chaitanya
2
https://orcid.org/0000-0003-1108-7941
Dartmouth College, Hanover, NH, USA
University of Waterloo, Canada
Front Matter, Table of Contents, Preface, Conference Organization
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.0/LIPIcs.APPROX-RANDOM.2022.0.pdf
Front Matter
Table of Contents
Preface
Conference Organization
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
1:1
1:22
10.4230/LIPIcs.APPROX/RANDOM.2022.1
article
A Unified Approach to Discrepancy Minimization
Bansal, Nikhil
1
Laddha, Aditi
2
Vempala, Santosh
2
University of Michigan, Ann Arbor, MI, USA
Georgia Tech, Atlanta, GA, USA
We study a unified approach and algorithm for constructive discrepancy minimization based on a stochastic process. By varying the parameters of the process, one can recover various state-of-the-art results. We demonstrate the flexibility of the method by deriving a discrepancy bound for smoothed instances, which interpolates between known bounds for worst-case and random instances.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.1/LIPIcs.APPROX-RANDOM.2022.1.pdf
Discrepancy theory
smoothed analysis
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
2:1
2:21
10.4230/LIPIcs.APPROX/RANDOM.2022.2
article
Fourier Growth of Regular Branching Programs
Lee, Chin Ho
1
Pyne, Edward
1
Vadhan, Salil
1
Harvard University, Cambridge, MA, USA
We analyze the Fourier growth, i.e. the L₁ Fourier weight at level k (denoted L_{1,k}), of read-once regular branching programs. We prove that every read-once regular branching program B of width w ∈ [1,∞] with s accepting states on n-bit inputs must have its L_{1,k} bounded by min{Pr[B(U_n) = 1](w-1)^k, s ⋅ O((n log n)/k)^{(k-1)/2}}. For any constant k, our result is tight up to constant factors for the AND function on w-1 bits, and is tight up to polylogarithmic factors for unbounded width programs. In particular, for k = 1 we have L_{1,1}(B) ≤ s, with no dependence on the width w of the program.
Our result gives new bounds on the coin problem and new pseudorandom generators (PRGs). Furthermore, we obtain an explicit generator for unordered permutation branching programs of unbounded width with a constant factor stretch, where no PRG was previously known.
Applying a composition theorem of Błasiok, Ivanov, Jin, Lee, Servedio and Viola (RANDOM 2021), we extend our results to "generalized group products," a generalization of modular sums and product tests.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.2/LIPIcs.APPROX-RANDOM.2022.2.pdf
pseudorandomness
fourier analysis
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
3:1
3:17
10.4230/LIPIcs.APPROX/RANDOM.2022.3
article
Double Balanced Sets in High Dimensional Expanders
Kaufman, Tali
1
Mass, David
1
Department of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
Recent works have shown that expansion of pseudorandom sets is of great importance. However, all current works on pseudorandom sets are limited to product (or approximate product) spaces, where Fourier Analysis methods can be applied. In this work we ask the natural question of whether pseudorandom sets are relevant in domains where Fourier Analysis methods cannot be applied, e.g., one-sided local spectral expanders.
We take the first step in the path of answering this question. We put forward a new definition for pseudorandom sets, which we call "double balanced sets". We demonstrate the strength of our new definition by showing that small double balanced sets in one-sided local spectral expanders have very strong expansion properties, such as unique-neighbor-like expansion. We further show that cohomologies in cosystolic expanders are double balanced, and use the newly derived strong expansion properties of double balanced sets in order to obtain an exponential improvement over the current state of the art lower bound on their minimal distance.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.3/LIPIcs.APPROX-RANDOM.2022.3.pdf
High dimensional expanders
Double balanced sets
Pseudorandom functions
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
4:1
4:18
10.4230/LIPIcs.APPROX/RANDOM.2022.4
article
Fast and Perfect Sampling of Subgraphs and Polymer Systems
Blanca, Antonio
1
Cannon, Sarah
2
Perkins, Will
3
Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, USA
Department of Mathematical Sciences, Claremont McKenna College, CA, USA
Department of Computer Science, Georgia Institute of Technology, Atlanta, GA, USA
We give an efficient perfect sampling algorithm for weighted, connected induced subgraphs (or graphlets) of rooted, bounded degree graphs. Our algorithm utilizes a vertex-percolation process with a carefully chosen rejection filter and works under a percolation subcriticality condition. We show that this condition is optimal in the sense that the task of (approximately) sampling weighted rooted graphlets becomes impossible in finite expected time for infinite graphs and intractable for finite graphs when the condition does not hold. We apply our sampling algorithm as a subroutine to give near linear-time perfect sampling algorithms for polymer models and weighted non-rooted graphlets in finite graphs, two widely studied yet very different problems. This new perfect sampling algorithm for polymer models gives improved sampling algorithms for spin systems at low temperatures on expander graphs and unbalanced bipartite graphs, among other applications.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.4/LIPIcs.APPROX-RANDOM.2022.4.pdf
Random Sampling
perfect sampling
graphlets
polymer models
spin systems
percolation
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
5:1
5:10
10.4230/LIPIcs.APPROX/RANDOM.2022.5
article
High Dimensional Expansion Implies Amplified Local Testability
Kaufman, Tali
1
Oppenheim, Izhar
2
Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
Department of Mathematics, Ben-Gurion University of the Negev, Be'er-Sheva, Israel
In this work, we define a notion of local testability of codes that is strictly stronger than the basic one (studied, e.g., by recent works on high rate LTCs), and we term it amplified local testability. Amplified local testability is a notion close to the result of optimal testing for Reed-Muller codes achieved by Bhattacharyya et al.
We present a scheme to get amplified locally testable codes from high dimensional expanders. We show that single orbit affine invariant codes, and in particular Reed-Muller codes, can be described via our scheme, and hence are amplified locally testable. This gives the strongest currently known testability result for single orbit affine invariant codes, strengthening the celebrated result of Kaufman and Sudan.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.5/LIPIcs.APPROX-RANDOM.2022.5.pdf
Locally testable codes
High dimensional expanders
Amplified testing
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
6:1
6:17
10.4230/LIPIcs.APPROX/RANDOM.2022.6
article
Polynomial Bounds on Parallel Repetition for All 3-Player Games with Binary Inputs
Girish, Uma
1
Mittal, Kunal
1
Raz, Ran
1
Zhan, Wei
1
Princeton University, NJ, USA
We prove that for every 3-player (3-prover) game G with value less than one, whose query distribution has the support S = {(1,0,0), (0,1,0), (0,0,1)} of Hamming weight one vectors, the value of the n-fold parallel repetition G^{⊗n} decays polynomially fast to zero; that is, there is a constant c = c(G) > 0 such that the value of the game G^{⊗n} is at most n^{-c}.
Following the recent work of Girish, Holmgren, Mittal, Raz and Zhan (STOC 2022), our result is the missing piece that implies a similar bound for a much more general class of multiplayer games: For every 3-player game G over binary questions and arbitrary answer lengths, with value less than 1, there is a constant c = c(G) > 0 such that the value of the game G^{⊗n} is at most n^{-c}.
Our proof technique is new and requires many new ideas. For example, we make use of the Level-k inequalities from Boolean Fourier Analysis, which, to the best of our knowledge, have not been explored in this context prior to our work.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.6/LIPIcs.APPROX-RANDOM.2022.6.pdf
Parallel repetition
Multi-prover games
Fourier analysis
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
7:1
7:17
10.4230/LIPIcs.APPROX/RANDOM.2022.7
article
Local Treewidth of Random and Noisy Graphs with Applications to Stopping Contagion in Networks
Mehta, Hermish
1
Reichman, Daniel
2
Citadel Securities, Chicago, IL, USA
Department of Computer Science, Worcester Polytechnic Institute, MA, USA
We study the notion of local treewidth in sparse random graphs: the maximum treewidth over all k-vertex subgraphs of an n-vertex graph. When k is not too large, we give nearly tight bounds for this local treewidth parameter; we also derive nearly tight bounds for the local treewidth of noisy trees, trees where every non-edge is added independently with small probability. We apply our upper bounds on the local treewidth to obtain fixed parameter tractable algorithms (on random graphs and noisy trees) for edge-removal problems centered around containing a contagious process evolving over a network. In these problems, our main parameter of study is k, the number of initially "infected" vertices in the network. For the random graph models we consider and a certain range of parameters the running time of our algorithms on n-vertex graphs is 2^o(k) poly(n), improving upon the 2^Ω(k) poly(n) performance of the best-known algorithms designed for worst-case instances of these edge deletion problems.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.7/LIPIcs.APPROX-RANDOM.2022.7.pdf
Graph Algorithms
Random Graphs
Data Structures and Algorithms
Discrete Mathematics
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
8:1
8:17
10.4230/LIPIcs.APPROX/RANDOM.2022.8
article
Beyond Single-Deletion Correcting Codes: Substitutions and Transpositions
Gabrys, Ryan
1
https://orcid.org/0000-0002-9197-3371
Guruswami, Venkatesan
2
https://orcid.org/0000-0001-7926-3396
Ribeiro, João
3
https://orcid.org/0000-0002-9870-0501
Wu, Ke
3
https://orcid.org/0000-0002-2756-8750
ECE Department, University of California, San Diego, CA, USA
EECS Department, University of California, Berkeley, CA, USA
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA
We consider the problem of designing low-redundancy codes in settings where one must correct deletions in conjunction with substitutions or adjacent transpositions; a combination of errors that is usually observed in DNA-based data storage. One of the most basic versions of this problem was settled more than 50 years ago by Levenshtein, who proved that binary Varshamov-Tenengolts codes correct one arbitrary edit error, i.e., one deletion or one substitution, with nearly optimal redundancy. However, this approach fails to extend to many simple and natural variations of the binary single-edit error setting. In this work, we make progress on the code design problem above in three such variations:
- We construct linear-time encodable and decodable length-n non-binary codes correcting a single edit error with nearly optimal redundancy log n+O(log log n), providing an alternative simpler proof of a result by Cai, Chee, Gabrys, Kiah, and Nguyen (IEEE Trans. Inf. Theory 2021). This is achieved by employing what we call weighted VT sketches, a new notion that may be of independent interest.
- We show the existence of a binary code correcting one deletion or one adjacent transposition with nearly optimal redundancy log n+O(log log n).
- We construct linear-time encodable and list-decodable binary codes with list-size 2 for one deletion and one substitution with redundancy 4log n+O(log log n). This matches the existential bound up to an O(log log n) additive term.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.8/LIPIcs.APPROX-RANDOM.2022.8.pdf
Synchronization errors
Optimal redundancy
Explicit codes
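Levenshtein's classical result quoted in the abstract above is concrete enough to sketch. The following toy implementation (a sketch under the standard definitions of binary Varshamov-Tenengolts codes and Levenshtein's single-deletion decoder; names are illustrative and not code from the paper) enumerates VT_0(n) and corrects any single deletion via the usual checksum-deficiency argument.

```python
# A minimal sketch (assumptions: the standard binary Varshamov-Tenengolts
# code VT_a(n) and Levenshtein's single-deletion decoder; illustrative names).
from itertools import product

def vt_checksum(x):
    """Weighted sum sum_i i*x_i with 1-indexed positions."""
    return sum(i * b for i, b in enumerate(x, start=1))

def vt_codewords(n, a=0):
    """All length-n binary words whose checksum is a (mod n+1)."""
    return [x for x in product((0, 1), repeat=n)
            if vt_checksum(x) % (n + 1) == a]

def vt_decode(y, n, a=0):
    """Recover the unique codeword of VT_a(n) yielding y after one deletion."""
    w = sum(y)                            # weight of the received word
    d = (a - vt_checksum(y)) % (n + 1)    # checksum deficiency
    y = list(y)
    if d <= w:
        # a 0 was deleted: reinsert it so that exactly d ones lie to its right
        pos, ones = len(y), 0
        while ones < d:
            pos -= 1
            ones += y[pos]
        y.insert(pos, 0)
    else:
        # a 1 was deleted: reinsert it so that d-w-1 zeros lie to its left
        pos, zeros = 0, 0
        while zeros < d - w - 1:
            zeros += 1 - y[pos]
            pos += 1
        y.insert(pos, 1)
    return tuple(y)

# every single deletion from a codeword of VT_0(8) is corrected
for x in vt_codewords(8):
    for p in range(8):
        assert vt_decode(x[:p] + x[p + 1:], 8) == x
```

The two decoder branches mirror the case analysis in the checksum argument: deleting a 0 lowers the checksum by the number of ones to its right (at most the weight), while deleting a 1 lowers it by strictly more, so the deficiency alone reveals which bit was lost and where to reinsert it.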
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
9:1
9:14
10.4230/LIPIcs.APPROX/RANDOM.2022.9
article
Affine Extractors and AC0-Parity
Huang, Xuangui
1
Ivanov, Peter
1
Viola, Emanuele
1
Northeastern University, Boston, MA, USA
We study a simple and general template for constructing affine extractors by composing a linear transformation with resilient functions. Using this we show that good affine extractors can be computed by non-explicit circuits of various types, including AC0-Xor circuits: AC0 circuits with a layer of parity gates at the input. We also show that one-sided extractors can be computed by small DNF-Xor circuits, and separate these circuits from other well-studied classes. As a further motivation for studying DNF-Xor circuits we show that if they can approximate inner product then small AC0-Xor circuits can compute it exactly - a long-standing open problem.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.9/LIPIcs.APPROX-RANDOM.2022.9.pdf
affine extractor
resilient function
constant-depth circuit
parity gate
inner product
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
10:1
10:19
10.4230/LIPIcs.APPROX/RANDOM.2022.10
article
Hyperbolic Concentration, Anti-Concentration, and Discrepancy
Song, Zhao
1
Zhang, Ruizhe
2
Adobe Research, Seattle, WA, USA
The University of Texas at Austin, TX, USA
The Chernoff bound is a fundamental tool in theoretical computer science, used extensively in randomized algorithm design and stochastic analysis. Discrepancy theory, which deals with finding a bi-coloring of a set system such that the coloring of each set is balanced, has a huge number of applications in the design of approximation algorithms. The Chernoff bound [Che52] implies that a random bi-coloring of any set system with n sets and n elements will have discrepancy O(√{n log n}) with high probability, while the famous result by Spencer [Spe85] shows that there exists an O(√n) discrepancy solution.
The study of hyperbolic polynomials dates back to the early 20th century, when they were used by Gårding to solve PDEs [Går59]. In recent years, more applications have been found in control theory, optimization, real algebraic geometry, and so on. In particular, the breakthrough result by Marcus, Spielman, and Srivastava [MSS15] uses the theory of hyperbolic polynomials to prove the Kadison-Singer conjecture [KS59], which is closely related to discrepancy theory.
In this paper, we present a list of new results for hyperbolic polynomials:
- We show two nearly optimal hyperbolic Chernoff bounds: one for Rademacher sum of arbitrary vectors and another for random vectors in the hyperbolic cone.
- We show a hyperbolic anti-concentration bound.
- We generalize the hyperbolic Kadison-Singer theorem [Brä18] for vectors in sub-isotropic position, and prove a hyperbolic Spencer theorem for any constant hyperbolic rank vectors.
The classical matrix Chernoff and discrepancy results are based on the determinant polynomial, which is a special case of a hyperbolic polynomial. To the best of our knowledge, this paper is the first work that shows either concentration or anti-concentration results for hyperbolic polynomials. We hope our findings provide more insights into hyperbolic and discrepancy theories.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.10/LIPIcs.APPROX-RANDOM.2022.10.pdf
Hyperbolic polynomial
Chernoff bound
Concentration
Discrepancy theory
Anti-concentration
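The random-coloring bound quoted in the abstract above is easy to observe empirically. A minimal sketch (illustrative only, not from the paper): sample a random set system over n elements and a uniform ±1 coloring, and check the discrepancy against a loose O(√(n log n)) envelope; the constant 4 below is a deliberately slack choice.

```python
# Illustration of the O(sqrt(n log n)) random-coloring discrepancy bound;
# a minimal sketch, not the paper's construction.
import math
import random

def discrepancy(sets, coloring):
    """max over sets S of |sum_{i in S} coloring[i]| for a ±1 coloring."""
    return max(abs(sum(coloring[i] for i in S)) for S in sets)

random.seed(0)
n = 200
# a random set system: n sets over n elements, each element included w.p. 1/2
sets = [[i for i in range(n) if random.random() < 0.5] for _ in range(n)]
chi = [random.choice((-1, 1)) for _ in range(n)]   # random bi-coloring

disc = discrepancy(sets, chi)
bound = 4 * math.sqrt(n * math.log(n))   # loose O(sqrt(n log n)) envelope
assert 0 <= disc <= bound
```

Spencer's theorem says a cleverer (non-random) coloring achieves O(√n), but no sampling experiment like this one finds it: a uniformly random coloring genuinely sits at the √(n log n) scale.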
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
11:1
11:19
10.4230/LIPIcs.APPROX/RANDOM.2022.11
article
Improved Local Testing for Multiplicity Codes
Karliner, Dan
1
Ta-Shma, Amnon
1
Department of Computer Science, Tel Aviv University, Israel
Multiplicity codes are a generalization of Reed-Muller codes which include derivatives as well as the values of low degree polynomials, evaluated at every point of 𝔽_p^m. Similarly to Reed-Muller codes, multiplicity codes have a local nature that allows for local correction and local testing. Recently, [Karliner et al., 2022] showed that the plane test, which tests the degree of the codeword on a random plane, is a good local tester for small enough degrees. In this work we simplify and extend the analysis of local testing for multiplicity codes, giving a more general and tight analysis. In particular, we show that multiplicity codes MRM_p(m, d, s) over prime fields with arbitrary d are locally testable by an appropriate k-flat test, which tests the degree of the codeword on a random k-dimensional affine subspace. The relationship between the degree parameter d and the required dimension k is shown to be nearly optimal, and improves on [Karliner et al., 2022] in the case of planes.
Our analysis relies on a generalization of the technique of canonical monomials introduced in [Haramaty et al., 2013]. Generalizing canonical monomials to the multiplicity case requires substantially different proofs which exploit the algebraic structure of multiplicity codes.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.11/LIPIcs.APPROX-RANDOM.2022.11.pdf
local testing
multiplicity codes
Reed-Muller codes
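The defining feature named in the abstract above, that multiplicity codes record derivatives alongside values, can be sketched in a few lines. The example below (illustrative, not the paper's construction; note that x^5 exceeds the degree regime actual multiplicity codes allow, it is used only to show that derivative coordinates carry extra information) encodes univariate polynomials over 𝔽_5 by the pair (f(a), f'(a)) at every point, using the formal derivative.

```python
# Sketch of an order-2 univariate multiplicity encoding over F_p:
# a polynomial is mapped to (f(a), f'(a)) for every a in F_p.
# Illustrative only; x^5 is outside the code's actual degree range.
p = 5

def poly_eval(coeffs, x):
    """Evaluate sum_j coeffs[j] * x^j over F_p (coeffs[j] multiplies x^j)."""
    return sum(c * pow(x, j, p) for j, c in enumerate(coeffs)) % p

def formal_derivative(coeffs):
    """Formal derivative over F_p: coefficient j*c_j for x^(j-1)."""
    return [(j * c) % p for j, c in enumerate(coeffs)][1:]

def mult_encode(coeffs):
    """Order-2 multiplicity encoding: (value, derivative) at every point."""
    d = formal_derivative(coeffs)
    return [(poly_eval(coeffs, a), poly_eval(d, a)) for a in range(p)]

f = [0, 0, 0, 0, 0, 1]   # x^5
g = [0, 1]               # x

# plain evaluations coincide (Fermat: a^5 = a in F_5) ...
assert [poly_eval(f, a) for a in range(p)] == [poly_eval(g, a) for a in range(p)]
# ... but the multiplicity encodings differ at every point, since
# (x^5)' = 5x^4 = 0 over F_5 while x' = 1
assert all(u != v for u, v in zip(mult_encode(f), mult_encode(g)))
```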
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
12:1
12:14
10.4230/LIPIcs.APPROX/RANDOM.2022.12
article
Unbalanced Expanders from Multiplicity Codes
Kalev, Itay
1
Ta-Shma, Amnon
1
https://orcid.org/0000-0001-8186-3622
Department of Computer Science, Tel Aviv University, Israel
In 2007 Guruswami, Umans and Vadhan gave an explicit construction of a lossless condenser based on Parvaresh-Vardy codes. This lossless condenser is a basic building block in many constructions, and, in particular, is behind the state-of-the-art extractor constructions.
We give an alternative construction that is based on Multiplicity codes. While the bottom-line result is similar to the GUV result, the analysis is very different. In GUV (and Parvaresh-Vardy codes) the polynomial ring is quotiented to obtain a finite field, and every polynomial is associated with related elements in that finite field. In our construction a polynomial from the polynomial ring is associated with its iterated derivatives. Our analysis boils down to solving a differential equation over a finite field, and uses techniques previously introduced by Kopparty [Swastik Kopparty, 2015] for the list-decoding setting. We also observe that these (and more general) questions were studied in differential algebra, and we use the terminology and results developed there.
We believe these techniques have the potential of getting better constructions and solving the current bottlenecks in the area.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.12/LIPIcs.APPROX-RANDOM.2022.12.pdf
Condensers
Multiplicity codes
Differential equations
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
13:1
13:23
10.4230/LIPIcs.APPROX/RANDOM.2022.13
article
Streaming Algorithms with Large Approximation Factors
Li, Yi
1
https://orcid.org/0000-0002-6420-653X
Lin, Honghao
2
Woodruff, David P.
2
Zhang, Yuheng
3
Division of Mathematical Sciences, Nanyang Technological University, Singapore
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA
Zhiyuan College, Shanghai Jiao Tong University, China
We initiate a broad study of classical problems in the streaming model with insertions and deletions in the setting where we allow the approximation factor α to be much larger than 1. Such algorithms can use significantly less memory than the usual setting for which α = 1+ε for an ε ∈ (0,1). We study large approximations for a number of problems in sketching and streaming, assuming that the underlying n-dimensional vector has all coordinates bounded by M throughout the data stream:
1) For the 𝓁_p norm/quasi-norm, 0 < p ≤ 2, we show that obtaining a poly(n)-approximation requires the same amount of memory as obtaining an O(1)-approximation for any M = n^Θ(1), which holds even for randomly ordered streams or for streams in the bounded deletion model.
2) For estimating the 𝓁_p norm, p > 2, we show an upper bound of O(n^{1-2/p} (log n log M)/α²) bits for an α-approximation, and give a matching lower bound for linear sketches.
3) For the 𝓁₂-heavy hitters problem, we show that the known lower bound of Ω(k log n log M) bits for identifying (1/k)-heavy hitters holds even if we are allowed to output items that are 1/(α k)-heavy, provided the algorithm succeeds with probability 1-O(1/n). We also obtain a lower bound for linear sketches that is tight even for constant failure probability algorithms.
4) For estimating the number 𝓁₀ of distinct elements, we give an n^{1/t}-approximation algorithm using O(t log log M) bits of space, as well as a lower bound of Ω(t) bits, both excluding the storage of random bits, where n is the dimension of the underlying frequency vector and M is an upper bound on the magnitude of its coordinates.
5) For α-approximation to the Schatten-p norm, we give near-optimal Õ(n^{2-4/p}/α⁴) sketching dimension for every even integer p and every α ≥ 1, while for p not an even integer we obtain near-optimal sketching dimension once α = Ω(n^{1/q-1/p}), where q is the largest even integer less than p. The latter is surprising as it is unknown what the complexity of Schatten-p norm estimation is for constant approximation; we show once the approximation factor is at least n^{1/q-1/p}, we can obtain near-optimal sketching bounds.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.13/LIPIcs.APPROX-RANDOM.2022.13.pdf
streaming algorithms
𝓁_p norm
heavy hitters
distinct elements
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
14:1
14:20
10.4230/LIPIcs.APPROX/RANDOM.2022.14
article
Local Stochastic Algorithms for Alignment in Self-Organizing Particle Systems
Kedia, Hridesh
1
Oh, Shunhao
1
Randall, Dana
1
https://orcid.org/0000-0002-1152-2627
Georgia Institute of Technology, Atlanta, GA, USA
We present local distributed, stochastic algorithms for alignment in self-organizing particle systems (SOPS) on two-dimensional lattices, where particles occupy unique sites on the lattice, and particles can make spatial moves to neighboring sites if they are unoccupied. Such models are abstractions of programmable matter, composed of individual computational particles with limited memory, strictly local communication abilities, and modest computational capabilities. We consider oriented particle systems, where particles are assigned a vector pointing in one of q directions, and each particle can compute the angle between its direction and the direction of any neighboring particle, although without knowledge of global orientation with respect to a fixed underlying coordinate system. Particles move stochastically, with each particle able to either modify its direction or make a local spatial move along a lattice edge during a move. We consider two settings: (a) where particle configurations must remain simply connected at all times and (b) where spatial moves are unconstrained and configurations can disconnect.
Our algorithms are inspired by the Potts model and its planar oriented variant known as the planar Potts model or clock model from statistical physics. We prove that for any q ≥ 2, by adjusting a single parameter, these self-organizing particle systems can be made to collectively align along a single dominant direction (analogous to a solid or ordered state) or remain non-aligned, in which case the fraction of particles oriented along any direction is nearly equal (analogous to a gaseous or disordered state). In the connected SOPS setting, we allow for two distinct parameters, one controlling the ferromagnetic attraction between neighboring particles (regardless of orientation) and the other controlling the preference of neighboring particles to align. We show that with appropriate settings of the input parameters, we can achieve compression and expansion, controlling how tightly gathered the particles are, as well as alignment or nonalignment, producing a single dominant orientation or not. While alignment is known for the Potts and clock models at sufficiently low temperatures, our proofs in the SOPS setting are significantly more challenging because the particles make spatial moves, not all sites are occupied, and the total number of particles is fixed.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.14/LIPIcs.APPROX-RANDOM.2022.14.pdf
Self-organizing particle systems
alignment
Markov chains
active matter
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
15:1
15:14
10.4230/LIPIcs.APPROX/RANDOM.2022.15
article
Tight Chernoff-Like Bounds Under Limited Independence
Skorski, Maciej
1
https://orcid.org/0000-0003-2997-7539
University of Luxembourg, Luxembourg
This paper develops sharp bounds on moments of sums of k-wise independent bounded random variables, under constrained average variance. The result closes a problem addressed in part in previous works of Schmidt et al. and of Bellare and Rompel. The work also discusses other applications of independent interest, such as asymptotically sharp bounds on binomial moments.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.15/LIPIcs.APPROX-RANDOM.2022.15.pdf
concentration inequalities
tail bounds
limited independence
k-wise independence
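The limited-independence setting named in the abstract above has a standard textbook construction worth sketching: take X_i = f(i) for a uniformly random polynomial f of degree < k over 𝔽_p; any k distinct evaluation points then have a uniform joint distribution (the Vandermonde matrix is invertible). The sketch below (illustrative, not from the paper) verifies the k = 2 case exhaustively.

```python
# A minimal sketch (not from the paper) of the classical construction of
# k-wise independent variables via random low-degree polynomials over F_p.
from itertools import product
from collections import Counter

p, k = 5, 2
joint = Counter()
for coeffs in product(range(p), repeat=k):      # all p^k polynomials
    f = lambda x: sum(c * pow(x, j, p) for j, c in enumerate(coeffs)) % p
    joint[(f(1), f(2))] += 1                    # joint law of (X_1, X_2)

# 2-wise independence: every value pair (a, b) is equally likely,
# appearing for exactly p^(k-2) of the polynomials
assert all(joint[(a, b)] == p ** (k - 2)
           for a, b in product(range(p), repeat=2))
```

Sums of such X_i are exactly the objects the paper's moment bounds apply to: they behave like fully independent sums only up to moments of order k, which is why the achievable tail bounds degrade with limited independence.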
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
16:1
16:24
10.4230/LIPIcs.APPROX/RANDOM.2022.16
article
Eigenstripping, Spectral Decay, and Edge-Expansion on Posets
Gaitonde, Jason
1
Hopkins, Max
2
Kaufman, Tali
3
Lovett, Shachar
2
Zhang, Ruizhe
4
Cornell University, Ithaca, NY, USA
University of California, San Diego, La Jolla, CA, USA
Bar-Ilan University, Ramat-Gan, Israel
The University of Texas at Austin, TX, USA
Fast mixing of random walks on hypergraphs (simplicial complexes) has recently led to myriad breakthroughs throughout theoretical computer science. Many important applications, however (e.g., to LTCs and 2-2 games), rely on a more general class of underlying structures called posets, and crucially take advantage of non-simplicial structure. These works make it clear that the global expansion properties of posets depend strongly on their underlying architecture (e.g. simplicial, cubical, linear algebraic), but the overall phenomenon remains poorly understood. In this work, we quantify the advantage of different poset architectures in both a spectral and combinatorial sense, highlighting how regularity controls the spectral decay and edge-expansion of corresponding random walks.
We show that the spectra of walks on expanding posets (Dikstein, Dinur, Filmus, Harsha APPROX-RANDOM 2018) concentrate in strips around a small number of approximate eigenvalues controlled by the regularity of the underlying poset. This gives a simple condition to identify poset architectures (e.g. the Grassmann) that exhibit strong (even exponential) decay of eigenvalues, versus architectures like hypergraphs whose eigenvalues decay linearly - a crucial distinction in applications to hardness of approximation and agreement testing such as the recent proof of the 2-2 Games Conjecture (Khot, Minzer, Safra FOCS 2018). We show these results lead to a tight characterization of edge-expansion on expanding posets in the 𝓁₂-regime (generalizing recent work of Bafna, Hopkins, Kaufman, and Lovett (SODA 2022)), and pay special attention to the case of the Grassmann where we show our results are tight for a natural set of sparsifications of the Grassmann graphs. We note for clarity that our results do not recover the characterization of expansion used in the proof of the 2-2 Games Conjecture which relies on 𝓁_∞ rather than 𝓁₂-structure.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.16/LIPIcs.APPROX-RANDOM.2022.16.pdf
High-dimensional expanders
posets
eposets
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
17:1
17:15
10.4230/LIPIcs.APPROX/RANDOM.2022.17
article
Accelerating Polarization via Alphabet Extension
Duursma, Iwan
1
https://orcid.org/0000-0002-2436-3944
Gabrys, Ryan
2
https://orcid.org/0000-0002-9197-3371
Guruswami, Venkatesan
3
https://orcid.org/0000-0001-7926-3396
Lin, Ting-Chun
2
4
https://orcid.org/0000-0002-8994-4598
Wang, Hsin-Po
2
https://orcid.org/0000-0003-2574-1510
University of Illinois Urbana-Champaign, IL, USA
University of California San Diego, CA, USA
University of California, Berkeley, CA, USA
Hon Hai (Foxconn) Research Institute, Taipei, Taiwan
Polarization is an unprecedented coding technique in that it not only achieves channel capacity, but also does so at a faster speed of convergence than any other technique. This speed is measured by the "scaling exponent" and its importance is three-fold. Firstly, estimating the scaling exponent is challenging and demands a deeper understanding of the dynamics of communication channels. Secondly, scaling exponents serve as a benchmark for different variants of polar codes that helps us select the proper variant for real-life applications. Thirdly, the need to optimize for the scaling exponent sheds light on how to reinforce the design of polar codes.
In this paper, we generalize the binary erasure channel (BEC), the simplest communication channel and the protagonist of many polar code studies, to the "tetrahedral erasure channel" (TEC). We then invoke Mori-Tanaka’s 2 × 2 matrix over 𝔽_4 to construct polar codes over TEC. Our main contribution is showing that the dynamics of TECs converge to an almost-one-parameter family of channels, which then leads to an upper bound of 3.328 on the scaling exponent. This is the first non-binary matrix whose scaling exponent is upper-bounded. It also polarizes BEC faster than all known binary matrices up to 23 × 23 in size. Our result indicates that expanding the alphabet is a more effective and practical alternative to enlarging the matrix in order to achieve faster polarization.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.17/LIPIcs.APPROX-RANDOM.2022.17.pdf
polar code
scaling exponent
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
18:1
18:23
10.4230/LIPIcs.APPROX/RANDOM.2022.18
article
Sketching Distances in Monotone Graph Classes
Esperet, Louis
1
https://orcid.org/0000-0001-6200-0514
Harms, Nathaniel
2
https://orcid.org/0000-0003-0259-9355
Kupavskii, Andrey
1
3
4
Univ. Grenoble Alpes, CNRS, Laboratoire G-SCOP, Grenoble, France
University of Waterloo, Canada
Moscow Institute of Physics and Technology, Russia
Huawei R&D Moscow, Russia
We study the problems of adjacency sketching, small-distance sketching, and approximate distance threshold (ADT) sketching for monotone classes of graphs. The algorithmic problem is to assign random sketches to the vertices of any graph G in the class, so that adjacency, exact distance thresholds, or approximate distance thresholds of two vertices u,v can be decided (with probability at least 2/3) from the sketches of u and v, by a decoder that does not know the graph. The goal is to determine when sketches of constant size exist.
Our main results are that, for monotone classes of graphs: constant-size adjacency sketches exist if and only if the class has bounded arboricity; constant-size small-distance sketches exist if and only if the class has bounded expansion; constant-size ADT sketches imply that the class has bounded expansion; any class of constant expansion (i.e. any proper minor closed class) has a constant-size ADT sketch; and a class may have arbitrarily small expansion without admitting a constant-size ADT sketch.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.18/LIPIcs.APPROX-RANDOM.2022.18.pdf
adjacency labelling
informative labelling
distance sketching
adjacency sketching
communication complexity
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
19:1
19:9
10.4230/LIPIcs.APPROX/RANDOM.2022.19
article
Communication Complexity of Collision
Göös, Mika
1
Jain, Siddhartha
1
https://orcid.org/0000-0003-2142-5801
EPFL, Lausanne, Switzerland
The Collision problem is to decide whether a given list of numbers (x_1,…,x_n) ∈ [n]ⁿ is 1-to-1 or 2-to-1, under the promise that one of the two is the case. We show an n^Ω(1) randomised communication lower bound for the natural two-party version of Collision where Alice holds the first half of the bits of each x_i and Bob holds the second half. As an application, we also show a similar lower bound for a weak bit-pigeonhole search problem, which answers a question of Itsykson and Riazanov (CCC 2021).
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.19/LIPIcs.APPROX-RANDOM.2022.19.pdf
Collision
Communication complexity
Lifting
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
20:1
20:21
10.4230/LIPIcs.APPROX/RANDOM.2022.20
article
Range Avoidance for Low-Depth Circuits and Connections to Pseudorandomness
Guruswami, Venkatesan
1
Lyu, Xin
1
Wang, Xiuhan
2
University of California Berkeley, CA, USA
Tsinghua University, Beijing, China
In the range avoidance problem, the input is a multi-output Boolean circuit with more outputs than inputs, and the goal is to find a string outside its range (which is guaranteed to exist). We show that well-known explicit construction questions such as finding binary linear codes achieving the Gilbert-Varshamov bound or list-decoding capacity, and constructing rigid matrices, reduce to the range avoidance problem of log-depth circuits, and by a further recent reduction [Ren, Santhanam, and Wang, FOCS 2022] to NC⁰₄ circuits where each output depends on at most 4 input bits.
On the algorithmic side, we show that range avoidance for NC⁰₂ circuits can be solved in polynomial time. We identify a general condition relating to correlation with low-degree parities that implies that any almost pairwise independent set has some string that avoids the range of every circuit in the class. We apply this to NC⁰ circuits, and to small-width CNF/DNF and general De Morgan formulae (via a connection to approximate degree), yielding non-trivial small hitting sets for range avoidance in these cases.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.20/LIPIcs.APPROX-RANDOM.2022.20.pdf
Pseudorandomness
Explicit constructions
Low-depth circuits
Boolean function analysis
Hitting sets
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
21:1
21:22
10.4230/LIPIcs.APPROX/RANDOM.2022.21
article
Learning Generalized Depth Three Arithmetic Circuits in the Non-Degenerate Case
Bhargava, Vishwas
1
Garg, Ankit
2
Kayal, Neeraj
2
Saha, Chandan
3
Department of Computer Science, Rutgers University, Piscataway, NJ, USA
Microsoft Research, Bangalore, India
Indian Institute of Science, Bangalore, India
Consider a homogeneous degree d polynomial f = T₁ + ⋯ + T_s, T_i = g_i(𝓁_{i,1}, …, 𝓁_{i, m}) where g_i’s are homogeneous m-variate degree d polynomials and 𝓁_{i,j}’s are linear polynomials in n variables. We design a (randomized) learning algorithm that given black-box access to f, computes black-boxes for the T_i’s. The running time of the algorithm is poly(n, m, d, s) and the algorithm works under some non-degeneracy conditions on the linear forms and the g_i’s, and some additional technical assumptions n ≥ (md)², s ≤ n^{d/4}. The non-degeneracy conditions on 𝓁_{i,j}’s constitute non-membership in a variety, and hence are satisfied when the coefficients of 𝓁_{i,j}’s are chosen uniformly and randomly from a large enough set. The conditions on g_i’s are satisfied for random polynomials and also for natural polynomials common in the study of arithmetic complexity like determinant, permanent, elementary symmetric polynomial, iterated matrix multiplication. A particularly appealing algorithmic corollary is the following: Given black-box access to an f = Det_r(L^(1)) + … + Det_r(L^(s)), where L^(k) = (𝓁_{i,j}^(k))_{i,j} with 𝓁_{i,j}^(k)’s being linear forms in n variables chosen randomly, there is an algorithm which in time poly(n, r) outputs matrices (M^(k))_k of linear forms s.t. there exists a permutation π: [s] → [s] with Det_r(M^(k)) = Det_r(L^(π(k))).
Our work follows the works [Neeraj Kayal and Chandan Saha, 2019; Garg et al., 2020] which use lower bound methods in arithmetic complexity to design average case learning algorithms. It also vastly generalizes the result in [Neeraj Kayal and Chandan Saha, 2019] about learning depth three circuits, which is a special case where each g_i is just a monomial. At the core of our algorithm is the partial derivative method which can be used to prove lower bounds for generalized depth three circuits. To apply the general framework in [Neeraj Kayal and Chandan Saha, 2019; Garg et al., 2020], we need to establish that the non-degeneracy conditions arising out of applying the framework with the partial derivative method are satisfied in the random case. We develop simple but general and powerful tools to establish this, which might be useful in designing average case learning algorithms for other arithmetic circuit models.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.21/LIPIcs.APPROX-RANDOM.2022.21.pdf
Arithmetic Circuits
Average-case Learning
Depth 3 Arithmetic Circuits
Learning Algorithms
Learning Circuits
Circuit Reconstruction
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
22:1
22:24
10.4230/LIPIcs.APPROX/RANDOM.2022.22
article
Lower Bound Methods for Sign-Rank and Their Limitations
Hatami, Hamed
1
Hatami, Pooya
2
https://orcid.org/0000-0001-7928-8008
Pires, William
1
Tao, Ran
3
Zhao, Rosie
1
School of Computer Science, McGill University, Montreal, Canada
Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
Department of Mathematics and Statistics, McGill University, Montreal, Canada
The sign-rank of a matrix A with ±1 entries is the smallest rank of a real matrix with the same sign pattern as A. To the best of our knowledge, there are only three known methods for proving lower bounds on the sign-rank of explicit matrices: (i) Sign-rank is at least the VC-dimension; (ii) Forster’s method, which states that sign-rank is at least the inverse of the largest possible average margin among the representations of the matrix by points and half-spaces; (iii) Sign-rank is at least a logarithmic function of the density of the largest monochromatic rectangle.
We prove several results regarding the limitations of these methods.
- We prove that, qualitatively, the monochromatic rectangle density is the strongest of these three lower bounds. If it fails to provide a super-constant lower bound for the sign-rank of a matrix, then the other two methods will fail as well.
- We show that there exist n × n matrices with sign-rank n^Ω(1) for which none of these methods can provide a super-constant lower bound.
- We show that sign-rank is at most an exponential function of the deterministic communication complexity with access to an equality oracle. We combine this result with Green and Sanders' quantitative version of Cohen’s idempotent theorem to show that for a large class of sign matrices (e.g., xor-lifts), sign-rank is at most an exponential function of the γ₂ norm of the matrix. We conjecture that this holds for all sign matrices.
- Towards answering a question of Linial, Mendelson, Schechtman, and Shraibman regarding the relation between sign-rank and discrepancy, we conjecture that sign-ranks of the ±1 adjacency matrices of hypercube graphs can be arbitrarily large. We prove that none of the three lower bound techniques can resolve this conjecture in the affirmative.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.22/LIPIcs.APPROX-RANDOM.2022.22.pdf
Average Margin
Communication complexity
margin complexity
monochromatic rectangle
Sign-rank
Unbounded-error communication complexity
VC-dimension
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
23:1
23:22
10.4230/LIPIcs.APPROX/RANDOM.2022.23
article
Black-Box Identity Testing of Noncommutative Rational Formulas of Inversion Height Two in Deterministic Quasipolynomial Time
Arvind, V.
1
Chatterjee, Abhranil
2
Mukhopadhyay, Partha
3
The Institute of Mathematical Sciences, HBNI, Chennai, India
Indian Institute of Technology Bombay, India
Chennai Mathematical Institute, India
Hrubeš and Wigderson [Hrubeš and Wigderson, 2015] initiated the complexity-theoretic study of noncommutative formulas with inverse gates. They introduced the Rational Identity Testing (RIT) problem which is to decide whether a noncommutative rational formula computes zero in the free skew field. In the white-box setting, there are deterministic polynomial-time algorithms due to Garg, Gurvits, Oliveira, and Wigderson [Ankit Garg et al., 2016] and Ivanyos, Qiao, and Subrahmanyam [Ivanyos et al., 2018].
A central open problem in this area is to design an efficient deterministic black-box identity testing algorithm for rational formulas. In this paper, we solve this for the first nested inverse case. More precisely, we obtain a deterministic quasipolynomial-time black-box RIT algorithm for noncommutative rational formulas of inversion height two via a hitting set construction. Several new technical ideas are involved in the hitting set construction, including concepts from matrix coefficient realization theory [Volčič, 2018] and properties of cyclic division algebras [T.Y. Lam, 2001]. En route to the proof, an important step is to embed the hitting set of Forbes and Shpilka for noncommutative formulas [Michael A. Forbes and Amir Shpilka, 2013] inside a cyclic division algebra of small index.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.23/LIPIcs.APPROX-RANDOM.2022.23.pdf
Rational Identity Testing
Black-box Derandomization
Cyclic Division Algebra
Matrix coefficient realization theory
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
24:1
24:15
10.4230/LIPIcs.APPROX/RANDOM.2022.24
article
Sampling from Potts on Random Graphs of Unbounded Degree via Random-Cluster Dynamics
Blanca, Antonio
1
Gheissari, Reza
2
Department of CSE, Pennsylvania State University, University Park, PA, USA
Department of Statistics and EECS, University of California, Berkeley, CA, USA
We consider the problem of sampling from the ferromagnetic Potts and random-cluster models on a general family of random graphs via the Glauber dynamics for the random-cluster model. The random-cluster model is parametrized by an edge probability p ∈ (0,1) and a cluster weight q > 0. We establish that for every q ≥ 1, the random-cluster Glauber dynamics mixes in optimal Θ(nlog n) steps on n-vertex random graphs having a prescribed degree sequence with bounded average branching γ throughout the full high-temperature uniqueness regime p < p_u(q,γ).
The family of random graph models we consider includes the Erdős-Rényi random graph G(n,γ/n), and so we provide the first polynomial-time sampling algorithm for the ferromagnetic Potts model on Erdős-Rényi random graphs for the full tree uniqueness regime. We accompany our results with mixing time lower bounds (exponential in the largest degree) for the Potts Glauber dynamics, in the same settings where our Θ(n log n) bounds for the random-cluster Glauber dynamics apply. This reveals a novel and significant computational advantage of random-cluster based algorithms for sampling from the Potts model at high temperatures.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.24/LIPIcs.APPROX-RANDOM.2022.24.pdf
Potts model
random-cluster model
random graphs
Markov chains
mixing time
tree uniqueness
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
25:1
25:17
10.4230/LIPIcs.APPROX/RANDOM.2022.25
article
Improved Bounds for Randomly Colouring Simple Hypergraphs
Feng, Weiming
1
https://orcid.org/0000-0003-4636-1023
Guo, Heng
1
https://orcid.org/0000-0001-8199-5596
Wang, Jiaheng
1
https://orcid.org/0000-0002-5191-545X
School of Informatics, University of Edinburgh, UK
We study the problem of sampling almost uniform proper q-colourings in k-uniform simple hypergraphs with maximum degree Δ. For any δ > 0, if k ≥ 20(1+δ)/δ and q ≥ 100Δ^{(2+δ)/(k-4/δ-4)}, the running time of our algorithm is Õ(poly(Δk)⋅ n^{1.01}), where n is the number of vertices. Our result requires fewer colours than previous results for general hypergraphs (Jain, Pham, and Vuong, 2021; He, Sun, and Wu, 2021), and does not require Ω(log n) colours unlike the work of Frieze and Anastos (2017).
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.25/LIPIcs.APPROX-RANDOM.2022.25.pdf
Approximate counting
Markov chain
Mixing time
Hypergraph colouring
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
26:1
26:17
10.4230/LIPIcs.APPROX/RANDOM.2022.26
article
Lifting with Inner Functions of Polynomial Discrepancy
Manor, Yahel
1
Meir, Or
1
https://orcid.org/0000-0001-5031-0750
Department of Computer Science, University of Haifa, Israel
Lifting theorems are theorems that bound the communication complexity of a composed function f∘gⁿ in terms of the query complexity of f and the communication complexity of g. Such theorems constitute a powerful generalization of direct-sum theorems for g, and have seen numerous applications in recent years.
We prove a new lifting theorem that works for every two functions f,g such that the discrepancy of g is at most inverse polynomial in the input length of f. Our result is a significant generalization of the known direct-sum theorem for discrepancy, and extends the range of inner functions g for which lifting theorems hold.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.26/LIPIcs.APPROX-RANDOM.2022.26.pdf
Lifting
communication complexity
query complexity
discrepancy
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
27:1
27:23
10.4230/LIPIcs.APPROX/RANDOM.2022.27
article
Exploring the Gap Between Tolerant and Non-Tolerant Distribution Testing
Chakraborty, Sourav
1
Fischer, Eldar
2
Ghosh, Arijit
1
Mishra, Gopinath
3
Sen, Sayantan
1
Indian Statistical Institute, Kolkata, India
Technion - Israel Institute of Technology, Haifa, Israel
University of Warwick, Coventry, UK
The framework of distribution testing is currently ubiquitous in the field of property testing. In this model, the input is a probability distribution accessible via independently drawn samples from an oracle. The testing task is to distinguish a distribution that satisfies some property from a distribution that is far in some distance measure from satisfying it. The task of tolerant testing imposes a further restriction, that distributions close to satisfying the property are also accepted.
This work focuses on the connection between the sample complexities of non-tolerant testing of distributions and their tolerant testing counterparts. When limiting our scope to label-invariant (symmetric) properties of distributions, we prove that the gap is at most quadratic, ignoring poly-logarithmic factors. Conversely, the property of being the uniform distribution is indeed known to have an almost-quadratic gap.
When moving to general, not necessarily label-invariant properties, the situation is more complicated, and we show some partial results. We show that if a property requires the distributions to be non-concentrated, that is, the probability mass of the distribution is sufficiently spread out, then it cannot be non-tolerantly tested with o(√n) many samples, where n denotes the universe size. Clearly, this implies at most a quadratic gap, because a distribution can be learned (and hence tolerantly tested against any property) using 𝒪(n) many samples. Being non-concentrated is a strong requirement on properties, as we also prove a close to linear lower bound against their tolerant tests.
Apart from the case where the distribution is non-concentrated, we also show that if an input distribution is very concentrated, in the sense that it is mostly supported on a subset of size s of the universe, then it can be learned using only 𝒪(s) many samples. The learning procedure adapts to the input, and works without knowing s in advance.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.27/LIPIcs.APPROX-RANDOM.2022.27.pdf
Distribution Testing
Tolerant Testing
Non-tolerant Testing
Sample Complexity
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
28:1
28:18
10.4230/LIPIcs.APPROX/RANDOM.2022.28
article
A Sublinear Local Access Implementation for the Chinese Restaurant Process
Mörters, Peter
1
https://orcid.org/0000-0002-8917-3789
Sohler, Christian
1
https://orcid.org/0000-0001-8990-3326
Walzer, Stefan
1
https://orcid.org/0000-0002-6477-0106
Universität zu Köln, Germany
The Chinese restaurant process is a stochastic process closely related to the Dirichlet process that groups sequentially arriving objects into a variable number of classes, such that within each class objects are cyclically ordered. A popular description involves a restaurant, where customers arrive one by one and either sit down next to a randomly chosen customer at one of the existing tables or open a new table. The full state of the process after n steps is given by a permutation of the n objects and cannot be represented in sublinear space. In particular, if we only need specific information about a few objects or classes it would be preferable to obtain the answers without simulating the process completely.
A recent line of research [Oded Goldreich et al., 2010; Moni Naor and Asaf Nussboim, 2007; Amartya Shankha Biswas et al., 2020; Guy Even et al., 2021] attempts to provide access to huge random objects without fully instantiating them. Such local access implementations provide answers to a sequence of queries about the random object, following the same distribution as if the object was fully generated. In this paper, we provide a local access implementation for a generalization of the Chinese restaurant process described above. Our implementation can be used to answer any sequence of adaptive queries about class affiliation of objects, number and sizes of classes at any time, position of elements within a class, or founding time of a class. The running time per query is polylogarithmic in the total size of the object, with high probability. Our approach relies on some ideas from the recent local access implementation for preferential attachment trees by Even et al. [Guy Even et al., 2021]. Such trees are related to the Chinese restaurant process in the sense that both involve a "rich-get-richer" phenomenon. A novel ingredient in our implementation is to embed the process in continuous time, in which the evolution of the different classes becomes stochastically independent [Joyce and Tavaré, 1987]. This independence is used to keep the probabilistic structure manageable even if many queries have already been answered. As similar embeddings are available for a wide range of urn processes [Krishna B. Athreya and Samuel Karlin, 1968], we believe that our approach may be applicable more generally. Moreover, local access implementations for birth and death processes that we encounter along the way may be of independent interest.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.28/LIPIcs.APPROX-RANDOM.2022.28.pdf
Chinese restaurant process
Dirichlet process
sublinear time algorithm
random recursive tree
random permutation
random partition
Ewens distribution
simulation
local access implementation
continuous time embedding
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
29:1
29:22
10.4230/LIPIcs.APPROX/RANDOM.2022.29
article
A Fully Adaptive Strategy for Hamiltonian Cycles in the Semi-Random Graph Process
Gao, Pu
1
MacRury, Calum
2
Prałat, Paweł
3
Department of Combinatorics and Optimization, University of Waterloo, Canada
Department of Computer Science, University of Toronto, Canada
Department of Mathematics, Toronto Metropolitan University, Canada
The semi-random graph process is a single player game in which the player is initially presented an empty graph on n vertices. In each round, a vertex u is presented to the player independently and uniformly at random. The player then adaptively selects a vertex v, and adds the edge uv to the graph. For a fixed monotone graph property, the objective of the player is to force the graph to satisfy this property with high probability in as few rounds as possible.
We focus on the problem of constructing a Hamiltonian cycle in as few rounds as possible. In particular, we present an adaptive strategy for the player which achieves this in αn rounds, where α < 2.01678 is derived from the solution to a system of differential equations. We also show that the player cannot achieve the desired property in less than βn rounds, where β > 1.26575. These results improve the previously best known bounds and, as a result, the gap between the upper and lower bounds is decreased from 1.39162 to 0.75102.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.29/LIPIcs.APPROX-RANDOM.2022.29.pdf
Random graphs and processes
Online adaptive algorithms
Hamiltonian cycles
Differential equation method
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
30:1
30:19
10.4230/LIPIcs.APPROX/RANDOM.2022.30
article
Cover and Hitting Times of Hyperbolic Random Graphs
Kiwi, Marcos
1
https://orcid.org/0000-0003-4171-2656
Schepers, Markus
2
https://orcid.org/0000-0003-2436-1540
Sylvester, John
3
https://orcid.org/0000-0002-6543-2934
Department of Industrial Engineering and Center for Mathematical Modeling, Universidad de Chile, Santiago, Chile
Institut für Medizinische Biometrie, Epidemiologie und Informatik, Johannes-Gutenberg-University Mainz, Germany
School of Computing Science, University of Glasgow, UK
We study random walks on the giant component of Hyperbolic Random Graphs (HRGs), in the regime when the degree distribution obeys a power law with exponent in the range (2,3). In particular, we focus on the expected times for a random walk to hit a given vertex or visit, i.e. cover, all vertices. We show that up to multiplicative constants: the cover time is n(log n)², the maximum hitting time is nlog n, and the average hitting time is n. The first two results hold in expectation and a.a.s. and the last in expectation (with respect to the HRG).
We prove these results by determining the effective resistance either between an average vertex and the well-connected "center" of HRGs or between an appropriately chosen collection of extremal vertices. We bound the effective resistance by the energy dissipated by carefully designed network flows associated to a tiling of the hyperbolic plane on which we overlay a forest-like structure.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.30/LIPIcs.APPROX-RANDOM.2022.30.pdf
Random walk
hyperbolic random graph
cover time
hitting time
average hitting time
target time
effective resistance
Kirchhoff index
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
31:1
31:21
10.4230/LIPIcs.APPROX/RANDOM.2022.31
article
Adaptive Sketches for Robust Regression with Importance Sampling
Mahabadi, Sepideh
1
Woodruff, David P.
2
Zhou, Samson
2
https://orcid.org/0000-0001-8288-5698
Microsoft Research, Redmond, WA, USA
Carnegie Mellon University, Pittsburgh, PA, USA
We introduce data structures for solving robust regression through stochastic gradient descent (SGD) by sampling gradients with probability proportional to their norm, i.e., importance sampling. Although SGD is widely used for large scale machine learning, it is well-known for possibly experiencing slow convergence rates due to the high variance from uniform sampling. On the other hand, importance sampling can significantly decrease the variance but is usually difficult to implement because computing the sampling probabilities requires additional passes over the data, in which case standard gradient descent (GD) could be used instead. In this paper, we introduce an algorithm that approximately samples T gradients of dimension d from nearly the optimal importance sampling distribution for a robust regression problem over n rows. Thus our algorithm effectively runs T steps of SGD with importance sampling while using sublinear space and just making a single pass over the data. Our techniques also extend to performing importance sampling for second-order optimization.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.31/LIPIcs.APPROX-RANDOM.2022.31.pdf
Streaming algorithms
stochastic optimization
importance sampling
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
32:1
32:14
10.4230/LIPIcs.APPROX/RANDOM.2022.32
article
Finding the KT Partition of a Weighted Graph in Near-Linear Time
Apers, Simon
1
Gawrychowski, Paweł
2
Lee, Troy
3
CNRS and IRIF, Paris, France
Institute of Computer Science, University of Wrocław, Poland
Centre for Quantum Software and Information, University of Technology Sydney, Australia
In a breakthrough work, Kawarabayashi and Thorup (J. ACM'19) gave a near-linear time deterministic algorithm to compute the weight of a minimum cut in a simple graph G = (V,E). A key component of this algorithm is finding the (1+ε)-KT partition of G, the coarsest partition {P_1, …, P_k} of V such that for every non-trivial (1+ε)-near minimum cut with sides {S, S̄} it holds that P_i is contained in either S or S̄, for i = 1, …, k. In this work we give a near-linear time randomized algorithm to find the (1+ε)-KT partition of a weighted graph. Our algorithm is quite different from that of Kawarabayashi and Thorup and builds on Karger’s framework of tree-respecting cuts (J. ACM'00).
We describe a number of applications of the algorithm. (i) The algorithm makes progress towards a more efficient algorithm for constructing the polygon representation of the set of near-minimum cuts in a graph. This is a generalization of the cactus representation, and was initially described by Benczúr (FOCS'95). (ii) We improve the time complexity of a recent quantum algorithm for minimum cut in a simple graph in the adjacency list model from Õ(n^{3/2}) to Õ(√{mn}), when the graph has n vertices and m edges. (iii) We describe a new type of randomized algorithm for minimum cut in simple graphs with complexity 𝒪(m + n log⁶ n). For graphs that are not too sparse, this matches the complexity of the current best 𝒪(m + n log² n) algorithm which uses a different approach based on random contractions.
The key technical contribution of our work is the following. Given a weighted graph G with m edges and a spanning tree T of G, consider the graph H whose nodes are the edges of T, and where there is an edge between two nodes of H iff the corresponding 2-respecting cut of T is a non-trivial near-minimum cut of G. We give a 𝒪(m log⁴ n) time deterministic algorithm to compute a spanning forest of H.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.32/LIPIcs.APPROX-RANDOM.2022.32.pdf
Graph theory
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
33:1
33:24
10.4230/LIPIcs.APPROX/RANDOM.2022.33
article
Maximum Matching Sans Maximal Matching: A New Approach for Finding Maximum Matchings in the Data Stream Model
Feldman, Moran
1
https://orcid.org/0000-0002-1535-2979
Szarf, Ariel
2
Department of Computer Science, University of Haifa, Israel
Department of Mathematics and Computer Science, Open University of Israel, Ra'anana, Israel
The problem of finding a maximum size matching in a graph (known as the maximum matching problem) is one of the most classical problems in computer science. Despite a significant body of work dedicated to the study of this problem in the data stream model, the state-of-the-art single-pass semi-streaming algorithm for it is still a simple greedy algorithm that computes a maximal matching, and this way obtains a 1/2-approximation. Some previous works described two/three-pass algorithms that improve over this approximation ratio by using their second and third passes to improve the above-mentioned maximal matching. One contribution of this paper continues this line of work by presenting new three-pass semi-streaming algorithms that work along these lines and obtain improved approximation ratios of 0.6111 and 0.5694 for triangle-free and general graphs, respectively.
Unfortunately, a recent work [Christian Konrad and Kheeran K. Naidu, 2021] shows that the strategy of constructing a maximal matching in the first pass and then improving it in further passes has limitations. Additionally, this technique is unlikely to get us closer to single-pass semi-streaming algorithms obtaining a better than 1/2-approximation. Therefore, it is interesting to come up with algorithms that do something else with their first pass (we term such algorithms non-maximal-matching-first algorithms). No such algorithms are currently known (to the best of our knowledge), and the main contribution of this paper is describing such algorithms that obtain approximation ratios of 0.5384 and 0.5555 in two and three passes, respectively, for general graphs (the result for three passes improves over the previous state-of-the-art, but is worse than the result of this paper mentioned in the previous paragraph for general graphs). The improvements obtained by these results are, unfortunately, numerically not very impressive, but the main importance (in our opinion) of these results is in demonstrating the potential of non-maximal-matching-first algorithms.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.33/LIPIcs.APPROX-RANDOM.2022.33.pdf
Maximum matching
semi-streaming algorithms
multi-pass algorithms
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
34:1
34:22
10.4230/LIPIcs.APPROX/RANDOM.2022.34
article
Ordered k-Median with Outliers
Deng, Shichuan
1
Zhang, Qianfan
1
IIIS, Tsinghua University, Beijing, China
We study a natural generalization of the celebrated ordered k-median problem, named robust ordered k-median, also known as ordered k-median with outliers. We are given facilities ℱ and clients 𝒞 in a metric space (ℱ∪𝒞,d), parameters k,m ∈ ℤ_+ and a non-increasing non-negative vector w ∈ ℝ_+^m. We seek to open k facilities F ⊆ ℱ and serve m clients C ⊆ 𝒞, inducing a service cost vector c = {d(j,F):j ∈ C}; the goal is to minimize the ordered objective w^⊤c^↓, where d(j,F) = min_{i ∈ F}d(j,i) is the minimum distance between client j and facilities in F, and c^↓ ∈ ℝ_+^m is the non-increasingly sorted version of c. Robust ordered k-median captures many interesting clustering problems recently studied in the literature, e.g., robust k-median, ordered k-median, etc.
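To make the ordered objective concrete, here is a minimal sketch (ours, not from the paper) that evaluates w^⊤c^↓ for a fixed choice of open facilities F and served clients C; the function names and the toy metric are illustrative assumptions.

```python
def ordered_objective(w, clients, facilities, d):
    """w: non-increasing weights of length m; clients: the m served clients;
    facilities: the k open facilities; d: metric d(j, i)."""
    # Service cost of each served client: distance to its nearest open facility.
    costs = [min(d(j, i) for i in facilities) for j in clients]
    # Sort costs non-increasingly (c↓) and take the weighted inner product w^T c↓.
    costs.sort(reverse=True)
    return sum(wi * ci for wi, ci in zip(w, costs))

# Toy example on the real line with d(x, y) = |x - y|:
d = lambda x, y: abs(x - y)
val = ordered_objective([2, 1], clients=[0, 4], facilities=[1], d=d)
# costs = [1, 3] -> sorted down [3, 1]; objective = 2*3 + 1*1 = 7
```

With uniform weights w = (1, …, 1) this reduces to the robust k-median objective, matching the claim that robust ordered k-median captures that special case.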
We obtain the first polynomial-time constant-factor approximation algorithm for robust ordered k-median, achieving an approximation guarantee of 127. The main difficulty comes from the presence of outliers, which already causes an unbounded integrality gap in the natural LP relaxation for robust k-median. This appears to invalidate previous methods in approximating the highly non-linear ordered objective. To overcome this issue, we introduce a novel yet very simple reduction framework that enables linear analysis of the non-linear objective. We also devise the first constant-factor approximations for ordered matroid median and ordered knapsack median using the same framework, and the approximation factors are 19.8 and 41.6, respectively.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.34/LIPIcs.APPROX-RANDOM.2022.34.pdf
clustering
approximation algorithm
design and analysis of algorithms
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
35:1
35:17
10.4230/LIPIcs.APPROX/RANDOM.2022.35
article
Sketching Approximability of (Weak) Monarchy Predicates
Chou, Chi-Ning
1
Golovnev, Alexander
2
Shahrasbi, Amirbehshad
3
Sudan, Madhu
4
Velusamy, Santhoshini
1
School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
Department of Computer Science, Georgetown University, Washington, D.C., USA
Microsoft, Redmond, WA, USA
School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
We analyze the sketching approximability of constraint satisfaction problems on Boolean domains, where the constraints are balanced linear threshold functions applied to literals. In particular, we explore the approximability of monarchy-like functions where the value of the function is determined by a weighted combination of the vote of the first variable (the president) and the sum of the votes of all remaining variables. In the pure version of this function, the president can be overruled only when all remaining variables agree. For every k ≥ 5, we show that CSPs where the underlying predicate is a pure monarchy function on k variables have no non-trivial sketching approximation algorithm in o(√n) space. We also show infinitely many weaker monarchy functions for which CSPs using such constraints are non-trivially approximable by O(log(n)) space sketching algorithms. Moreover, we give the first example of sketching approximable asymmetric Boolean CSPs. Our results work within the framework of Chou, Golovnev, Sudan, and Velusamy (FOCS 2021) that characterizes the sketching approximability of all CSPs. Their framework can be applied naturally to get a computer-aided analysis of the approximability of any specific constraint satisfaction problem. The novelty of our work is in using their work to get an analysis that applies to infinitely many problems simultaneously.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.35/LIPIcs.APPROX-RANDOM.2022.35.pdf
sketching algorithms
approximability
linear threshold functions
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
36:1
36:13
10.4230/LIPIcs.APPROX/RANDOM.2022.36
article
Integrality Gap of Time-Indexed Linear Programming Relaxation for Coflow Scheduling
Fukunaga, Takuro
1
https://orcid.org/0000-0003-3285-2876
Faculty of Science and Engineering, Chuo University, Tokyo, Japan
A coflow is a set of related parallel data flows in a network. The goal of coflow scheduling is to process all the demands of the given coflows while minimizing the weighted completion time. It is known that the coflow scheduling problem admits several polynomial-time 5-approximation algorithms that compute solutions by rounding linear programming (LP) relaxations of the problem. In this paper, we investigate the time-indexed LP relaxation for coflow scheduling. We show that the integrality gap of the time-indexed LP relaxation is at most 4. We also show that yet another polynomial-time 5-approximation algorithm can be obtained by rounding the solutions to the time-indexed LP relaxation.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.36/LIPIcs.APPROX-RANDOM.2022.36.pdf
coflow scheduling
hypergraph matching
approximation algorithm
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
37:1
37:19
10.4230/LIPIcs.APPROX/RANDOM.2022.37
article
Fair Correlation Clustering in General Graphs
Schwartz, Roy
1
Zats, Roded
1
The Henry and Marilyn Taub Faculty of Computer Science, Technion, Haifa, Israel
We consider the family of Correlation Clustering optimization problems under fairness constraints. In Correlation Clustering we are given a graph whose every edge is labeled either with a + or a -, and the goal is to find a clustering that agrees the most with the labels: + edges within clusters and - edges across clusters. The notion of fairness implies that there is no over- or under-representation of vertices in the clustering: every vertex has a color and the distribution of colors within each cluster is required to be the same as the distribution of colors in the input graph. Previously, approximation algorithms were known only for fair disagreement minimization in complete unweighted graphs. We prove the following: (1) there is no finite approximation for fair disagreement minimization in general graphs unless P = NP (this hardness holds also for bicriteria algorithms); and (2) fair agreement maximization in general graphs admits a bicriteria approximation of ≈ 0.591 (an improved ≈ 0.609 true approximation is given for the special case of two uniformly distributed colors). Our algorithm is based on proving that the sticky Brownian motion rounding of [Abbasi Zadeh-Bansal-Guruganesh-Nikolov-Schwartz-Singh SODA'20] copes well with uncut edges.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.37/LIPIcs.APPROX-RANDOM.2022.37.pdf
Correlation Clustering
Approximation Algorithms
Semi-Definite Programming
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
38:1
38:23
10.4230/LIPIcs.APPROX/RANDOM.2022.38
article
On Sketching Approximations for Symmetric Boolean CSPs
Boyland, Joanna
1
Hwang, Michael
1
Prasad, Tarun
1
https://orcid.org/0000-0002-3706-4230
Singer, Noah
2
1
https://orcid.org/0000-0002-0076-521X
Velusamy, Santhoshini
3
https://orcid.org/0000-0002-0294-5425
Harvard College, Harvard University, Cambridge, MA, USA
Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
A Boolean maximum constraint satisfaction problem, Max-CSP(f), is specified by a predicate f:{-1,1}^k → {0,1}. An n-variable instance of Max-CSP(f) consists of a list of constraints, each of which applies f to k distinct literals drawn from the n variables. For k = 2, Chou, Golovnev, and Velusamy [Chou et al., 2020] obtained explicit ratios characterizing the √n-space streaming approximability of every predicate. For k ≥ 3, Chou, Golovnev, Sudan, and Velusamy [Chou et al., 2022] proved a general dichotomy theorem for √n-space sketching algorithms: For every f, there exists α(f) ∈ (0,1] such that for every ε > 0, Max-CSP(f) is (α(f)-ε)-approximable by an O(log n)-space linear sketching algorithm, but (α(f)+ε)-approximation sketching algorithms require Ω(√n) space.
In this work, we give closed-form expressions for the sketching approximation ratios of multiple families of symmetric Boolean functions. Letting α'_k = 2^{-(k-1)} (1-k^{-2})^{(k-1)/2}, we show that for odd k ≥ 3, α(kAND) = α'_k, and for even k ≥ 2, α(kAND) = 2α'_{k+1}. Thus, for every k, kAND can be (2-o(1))2^{-k}-approximated by O(log n)-space sketching algorithms; we contrast this with a lower bound of Chou, Golovnev, Sudan, Velingker, and Velusamy [Chou et al., 2022] implying that streaming (2+ε)2^{-k}-approximations require Ω(n) space! We also resolve the ratio for the "at-least-(k-1)-1’s" function for all even k; the "exactly-(k+1)/2-1’s" function for odd k ∈ {3,…,51}; and fifteen other functions. We stress here that for general f, the dichotomy theorem in [Chou et al., 2022] only implies that α(f) can be computed to arbitrary precision in PSPACE, and thus closed-form expressions need not have existed a priori. Our analyses involve identifying and exploiting structural "saddle-point" properties of this dichotomy.
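The stated closed form can be checked numerically. The sketch below (ours, not from the paper) evaluates α'_k = 2^{-(k-1)}(1 - k^{-2})^{(k-1)/2} and confirms that the resulting α(kAND) sits between 2^{-k} and 2·2^{-k}, consistent with the (2-o(1))2^{-k} claim.

```python
def alpha_prime(k: int) -> float:
    # α'_k = 2^{-(k-1)} * (1 - k^{-2})^{(k-1)/2}
    return 2.0 ** (-(k - 1)) * (1 - k ** -2) ** ((k - 1) / 2)

def alpha_kand(k: int) -> float:
    # α(kAND) = α'_k for odd k >= 3, and 2 * α'_{k+1} for even k >= 2.
    return alpha_prime(k) if k % 2 == 1 else 2 * alpha_prime(k + 1)

for k in range(2, 30):
    ratio = alpha_kand(k) / 2 ** -k  # tends to 2 from below as k grows
    assert 1.0 < ratio < 2.0
```

For odd k the ratio equals 2(1 - k^{-2})^{(k-1)/2}, which approaches 2 since (k-1)/(2k²) → 0.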
Separately, for all threshold functions, we give optimal "bias-based" approximation algorithms generalizing [Chou et al., 2020] while simplifying [Chou et al., 2022]. Finally, we investigate the √n-space streaming lower bounds in [Chou et al., 2022], and show that they are incomplete for 3AND, i.e., they fail to rule out (α(3AND)-ε)-approximations in o(√n) space.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.38/LIPIcs.APPROX-RANDOM.2022.38.pdf
Streaming algorithms
constraint satisfaction problems
approximability
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
39:1
39:28
10.4230/LIPIcs.APPROX/RANDOM.2022.39
article
Massively Parallel Algorithms for Small Subgraph Counting
Biswas, Amartya Shankha
1
Eden, Talya
2
3
Liu, Quanquan C.
4
Rubinfeld, Ronitt
1
Mitrović, Slobodan
5
CSAIL, MIT, Cambridge, MA, USA
CSAIL, MIT, Cambridge, MA, USA
Boston University, MA, USA
Northwestern University, Evanston, IL, USA
University of California Davis, CA, USA
Over the last two decades, frameworks for distributed-memory parallel computation, such as MapReduce, Hadoop, Spark and Dryad, have gained significant popularity with the growing prevalence of large network datasets. The Massively Parallel Computation (MPC) model is the de facto standard for studying graph algorithms in these frameworks theoretically. Subgraph counting is one such fundamental problem in analyzing massive graphs, with the main algorithmic challenges centering on designing methods which are both scalable and accurate.
Given a graph G = (V, E) with n vertices, m edges and T triangles, our first result is an algorithm that outputs a (1+ε)-approximation to T, with asymptotically optimal round and total space complexity provided S ≥ max{√m, n²/m} space per machine and assuming T = Ω(√{m/n}). Our result gives a quadratic improvement on the bound on T over previous works. We also provide a simple extension of our result to counting any subgraph of size k for constant k ≥ 1. Our second result is an O_δ(log log n)-round algorithm for exactly counting the number of triangles, whose total space usage is parametrized by the arboricity α of the input graph. We extend this result to exactly counting k-cliques for any constant k. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most 5 can be implemented in the MPC model in Õ_δ(√{log n}) rounds, O(n^δ) space per machine and O(mα³) total space.
In addition to our theoretical results, we simulate our triangle counting algorithms in real-world graphs obtained from the Stanford Network Analysis Project (SNAP) database. Our results show that both our approximate and exact counting algorithms exhibit improvements in terms of round complexity and approximation ratio, respectively, compared to two previous widely used algorithms for these problems.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.39/LIPIcs.APPROX-RANDOM.2022.39.pdf
triangle counting
massively parallel computation
clique counting
approximation algorithms
subgraph counting
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
40:1
40:14
10.4230/LIPIcs.APPROX/RANDOM.2022.40
article
Hardness Results for Weaver’s Discrepancy Problem
Spielman, Daniel A.
1
Zhang, Peng
2
Yale University, New Haven, CT, USA
Rutgers University, Piscataway, NJ, USA
Marcus, Spielman and Srivastava (Annals of Mathematics 2014) solved the Kadison-Singer Problem by proving a strong form of Weaver’s conjecture: they showed that for all α > 0 and all lists of vectors of norm at most √α whose outer products sum to the identity, there exists a signed sum of those outer products with operator norm at most √{8α} + 2α. We prove that it is NP-hard to distinguish such a list of vectors for which there is a signed sum that equals the zero matrix from those in which every signed sum has operator norm at least η √α, for some absolute constant η > 0. Thus, it is NP-hard to construct a signing that is a constant factor better than that guaranteed to exist.
For α = 1/4, we prove that it is NP-hard to distinguish whether there is a signed sum that equals the zero matrix from the case in which every signed sum has operator norm at least 1/4.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.40/LIPIcs.APPROX-RANDOM.2022.40.pdf
Discrepancy Problem
Kadison-Singer Problem
Hardness of Approximation
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
41:1
41:19
10.4230/LIPIcs.APPROX/RANDOM.2022.41
article
Relative Survivable Network Design
Dinitz, Michael
1
Koranteng, Ama
1
Kortsarz, Guy
2
Johns Hopkins University, Baltimore, MD, USA
Rutgers University, Camden, NJ, USA
One of the most important and well-studied settings for network design is edge-connectivity requirements. This encompasses uniform demands such as the Minimum k-Edge-Connected Spanning Subgraph problem (k-ECSS), as well as nonuniform demands such as the Survivable Network Design problem. A weakness of these formulations, though, is that we are not able to ask for fault-tolerance larger than the connectivity. Taking inspiration from recent definitions and progress in graph spanners, we introduce and study new variants of these problems under a notion of relative fault-tolerance. Informally, we require not that two nodes are connected if there are a bounded number of faults (as in the classical setting), but that two nodes are connected if there are a bounded number of faults and the two nodes are connected in the underlying graph post-faults. That is, the subgraph we build must "behave" identically to the underlying graph with respect to connectivity after bounded faults.
We define and introduce these problems, and provide the first approximation algorithms: a (1+4/k)-approximation for the unweighted relative version of k-ECSS, a 2-approximation for the weighted relative version of k-ECSS, and a 27/4-approximation for the special case of Relative Survivable Network Design with only a single demand with a connectivity requirement of 3. To obtain these results, we introduce a number of technical ideas that may be of independent interest. First, we give a generalization of Jain’s iterative rounding analysis that works even when the cut-requirement function is not weakly supermodular, but instead satisfies a weaker definition we introduce and term local weak supermodularity. Second, we prove a structure theorem and design an approximation algorithm utilizing a new decomposition based on important separators, which are structures commonly used in fixed-parameter algorithms that have not commonly been used in approximation algorithms.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.41/LIPIcs.APPROX-RANDOM.2022.41.pdf
Fault Tolerance
Network Design
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
42:1
42:7
10.4230/LIPIcs.APPROX/RANDOM.2022.42
article
Bypassing the XOR Trick: Stronger Certificates for Hypergraph Clique Number
Guruswami, Venkatesan
1
Kothari, Pravesh K.
2
Manohar, Peter
2
University of California Berkeley, CA, USA
Carnegie Mellon University, Pittsburgh, PA, USA
Let H(k,n,p) be the distribution on k-uniform hypergraphs where every subset of [n] of size k is included as a hyperedge with probability p independently. In this work, we design and analyze a simple spectral algorithm that certifies a bound on the size of the largest clique, ω(H), in hypergraphs H ∼ H(k,n,p). For example, for any constant p, with high probability over the choice of the hypergraph, our spectral algorithm certifies a bound of Õ(√n) on the clique number in polynomial time. This matches, up to polylog(n) factors, the best known certificate for the clique number in random graphs, which is the special case of k = 2.
Prior to our work, the best known refutation algorithms [Amin Coja-Oghlan et al., 2004; Sarah R. Allen et al., 2015] rely on a reduction to the problem of refuting random k-XOR via Feige’s XOR trick [Uriel Feige, 2002], and yield a polynomially worse bound of Õ(n^{3/4}) on the clique number when p = O(1). Our algorithm bypasses the XOR trick and relies instead on a natural generalization of the Lovász theta semidefinite programming relaxation for cliques in hypergraphs.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.42/LIPIcs.APPROX-RANDOM.2022.42.pdf
Planted clique
Average-case complexity
Spectral refutation
Random matrix theory
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
43:1
43:16
10.4230/LIPIcs.APPROX/RANDOM.2022.43
article
Approximating CSPs with Outliers
Ghoshal, Suprovat
1
Louis, Anand
2
University of Michigan, Ann Arbor, MI, USA
Indian Institute of Science, Bangalore, India
Constraint satisfaction problems (CSPs) are ubiquitous in theoretical computer science. We study the problem of Strong-CSPs, i.e., instances where a large induced sub-instance has a satisfying assignment. More formally, given a CSP instance 𝒢(V, E, [k], {Π_{ij}}_{(i,j) ∈ E}) consisting of a set of vertices V, a set of edges E, an alphabet [k], and a constraint Π_{ij} ⊂ [k] × [k] for each (i,j) ∈ E, the goal of this problem is to compute the largest subset S ⊆ V such that the instance induced on S has an assignment that satisfies all the constraints.
In this paper, we study approximation algorithms for UniqueGames and related problems under the Strong-CSP framework when the underlying constraint graph satisfies mild expansion properties. In particular, we show that given a StrongUniqueGames instance whose optimal solution S^* is supported on a regular low threshold rank graph, there exists an algorithm that runs in time exponential in the threshold rank, and recovers a large satisfiable sub-instance whose size is independent of the label set size and maximum degree of the graph. Our algorithm combines the techniques of Barak-Raghavendra-Steurer (FOCS'11) and Guruswami-Sinop (FOCS'11) with several new ideas and runs in time exponential in the threshold rank of the optimal set. A key component of our algorithm is a new threshold rank based spectral decomposition, which is used to compute a "large" induced subgraph of "small" threshold rank; our techniques build on the work of Oveis Gharan and Rezaei (SODA'17), and could be of independent interest.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.43/LIPIcs.APPROX-RANDOM.2022.43.pdf
Constraint Satisfaction Problems
Strong Unique Games
Threshold Rank
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
44:1
44:21
10.4230/LIPIcs.APPROX/RANDOM.2022.44
article
Submodular Dominance and Applications
Qiu, Frederick
1
Singla, Sahil
2
Department of Computer Science, Princeton University, NJ, USA
School of Computer Science, Georgia Tech, Atlanta, GA, USA
In submodular optimization we often deal with the expected value of a submodular function f on a distribution 𝒟 over sets of elements. In this work we study such submodular expectations for negatively dependent distributions. We introduce a natural notion of negative dependence, which we call Weak Negative Regression (WNR), that generalizes both Negative Association and Negative Regression. We observe that WNR distributions satisfy Submodular Dominance, whereby the expected value of f under 𝒟 is at least the expected value of f under a product distribution with the same element-marginals.
Next, we give several applications of Submodular Dominance to submodular optimization. In particular, we improve the best known submodular prophet inequalities, we develop new rounding techniques for polytopes of set systems that admit negatively dependent distributions, and we prove existence of contention resolution schemes for WNR distributions.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.44/LIPIcs.APPROX-RANDOM.2022.44.pdf
Submodular Optimization
Negative Dependence
Negative Association
Weak Negative Regression
Submodular Dominance
Submodular Prophet Inequality
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
45:1
45:17
10.4230/LIPIcs.APPROX/RANDOM.2022.45
article
Online Facility Location with Linear Delay
Bienkowski, Marcin
1
https://orcid.org/0000-0002-2453-7772
Böhm, Martin
1
https://orcid.org/0000-0003-4796-7422
Byrka, Jarosław
1
https://orcid.org/0000-0002-3387-0913
Marcinkowski, Jan
1
https://orcid.org/0000-0002-6517-0014
Institute of Computer Science, University of Wrocław, Poland
In the problem of online facility location with delay, a sequence of n clients appear in the metric space, and they need to be eventually connected to some open facility. The clients do not have to be connected immediately, but such a choice comes with a certain penalty: each client incurs a waiting cost (equal to the difference between its arrival and its connection time). At any point in time, an algorithm may decide to open a facility and connect any subset of clients to it. That is, an algorithm needs to balance three types of costs: cost of opening facilities, costs of connecting clients, and the waiting costs of clients. We study a natural variant of this problem, where clients may be connected also to an already open facility, but such action incurs an extra cost: an algorithm pays for waiting of the facility (a cost incurred separately for each such "late" connection). This is reminiscent of online matching with delays, where both sides of the connection incur a waiting cost. We call this variant two-sided delay to differentiate it from the previously studied one-sided delay, where clients may connect to a facility only at its opening time.
We present an O(1)-competitive deterministic algorithm for the two-sided delay variant. Our approach is an extension of the approach used by Jain, Mahdian and Saberi [STOC 2002] for analyzing the performance of offline algorithms for facility location. To this end, we substantially simplify the part of the original argument in which a bound on the sequence of factor-revealing LPs is derived. We then show how to transform our O(1)-competitive algorithm for the two-sided delay variant to O(log n / log log n)-competitive deterministic algorithm for one-sided delays. This improves the known O(log n) bound by Azar and Touitou [FOCS 2020]. We note that all previous online algorithms for problems with delays in general metrics have at least logarithmic ratios.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.45/LIPIcs.APPROX-RANDOM.2022.45.pdf
online facility location
network design problems
facility location with delay
JMS algorithm
competitive analysis
factor revealing LP
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
46:1
46:24
10.4230/LIPIcs.APPROX/RANDOM.2022.46
article
Prophet Matching in the Probe-Commit Model
Borodin, Allan
1
MacRury, Calum
1
Rakheja, Akash
1
Department of Computer Science, University of Toronto, Canada
We consider the online bipartite stochastic matching problem with known i.d. (independently distributed) online vertex arrivals. In this problem, when an online vertex arrives, its weighted edges must be probed (queried) to determine if they exist, based on known edge probabilities. Our algorithms operate in the probe-commit model, in that if a probed edge exists, it must be used in the matching. Additionally, each online node has a downward-closed probing constraint on its adjacent edges which indicates which sequences of edge probes are allowable. Our setting generalizes the commonly studied patience (or time-out) constraint which limits the number of probes that can be made to an online node’s adjacent edges. Most notably, this includes non-uniform edge probing costs (specified by a knapsack/budget constraint). We extend a recently introduced configuration LP to the known i.d. setting, and also provide the first proof that it is a relaxation of an optimal offline probing algorithm (the offline adaptive benchmark). Using this LP, we establish the following competitive ratio results against the offline adaptive benchmark:
1) A tight 1/2 ratio when the arrival ordering π is chosen adversarially.
2) A 1-1/e ratio when the arrival ordering π is chosen u.a.r. (uniformly at random). If π is generated adversarially, we generalize the prophet inequality matching problem. If π is u.a.r., we generalize the prophet secretary matching problem. Both results improve upon the previous best competitive ratio of 0.46 in the more restricted known i.i.d. (independent and identically distributed) arrival model against the standard offline adaptive benchmark due to Brubach et al. We are the first to study the prophet secretary matching problem in the context of probing, and our 1-1/e ratio matches the best known result without probing due to Ehsani et al. This result also applies to the unconstrained bipartite matching probe-commit problem, where we match the best known result due to Gamlath et al.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.46/LIPIcs.APPROX-RANDOM.2022.46.pdf
Stochastic probing
Online algorithms
Bipartite matching
Optimization under uncertainty
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
47:1
47:14
10.4230/LIPIcs.APPROX/RANDOM.2022.47
article
The Biased Homogeneous r-Lin Problem
Ghoshal, Suprovat
1
University of Michigan, Ann Arbor, MI, USA
The p-biased Homogeneous r-Lin problem (Hom-r-Lin_p) is the following: given a homogeneous system of r-variable equations over 𝔽₂, the goal is to find an assignment of relative weight p that satisfies the maximum number of equations. In a celebrated work, Håstad (JACM 2001) showed that the unconstrained variant of this, i.e., Max-3-Lin, is hard to approximate beyond a factor of 1/2. This is also tight due to the naive random guessing algorithm which sets every variable uniformly from {0,1}. Subsequently, Holmerin and Khot (STOC 2004) showed that the same holds for the balanced Hom-r-Lin problem as well. In this work, we explore the approximability of the Hom-r-Lin_p problem beyond the balanced setting (i.e., p ≠ 1/2), and investigate whether the (p-biased) random guessing algorithm is optimal for every p. Our results include the following:
- The Hom-r-Lin_p problem has no efficient (1/2 + (1/2)(1 - 2p)^{r-2} + ε)-approximation algorithm for every p if r is even, and for p ∈ (0,1/2] if r is odd, unless NP ⊂ ∪_{ε>0}DTIME(2^{n^ε}).
- For any r and any p, there exists an efficient (1/2)(1 - e^{-2})-approximation algorithm for Hom-r-Lin_p. We show that this is also tight for odd values of r (up to o_r(1)-additive factors) assuming the Unique Games Conjecture. Our results imply that when r is even and large, random guessing is near optimal for every p. On the other hand, when r is odd, our results illustrate an interesting contrast between the regimes p ∈ (0,1/2) (where random guessing is near optimal) and p → 1 (where random guessing is far from optimal). A key technical contribution of our work is a generalization of Håstad’s 3-query dictatorship test to the p-biased setting.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.47/LIPIcs.APPROX-RANDOM.2022.47.pdf
Biased Approximation Resistance
Constraint Satisfaction Problems
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
48:1
48:20
10.4230/LIPIcs.APPROX/RANDOM.2022.48
article
Asymptotically Optimal Bounds for Estimating H-Index in Sublinear Time with Applications to Subgraph Counting
Assadi, Sepehr
1
Nguyen, Hoai-An
1
Department of Computer Science, Rutgers University, Piscataway, NJ, USA
The h-index is a metric used to measure the impact of a user in a publication setting, such as a member of a social network with many highly liked posts or a researcher in an academic domain with many highly cited publications. Specifically, the h-index of a user is the largest integer h such that at least h publications of the user have at least h units of positive feedback.
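The definition above can be illustrated with a naive exact computation (ours, included only to make the definition concrete; the paper's contribution is estimating h in sublinear time, which this linear-time sketch does not attempt).

```python
def h_index(citations):
    """Largest h such that at least h values in `citations` are >= h."""
    citations = sorted(citations, reverse=True)
    h = 0
    # After sorting non-increasingly, position i (1-indexed) holds the i-th
    # largest value; h is the last position whose value is still >= i.
    for i, c in enumerate(citations, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

assert h_index([10, 8, 5, 4, 3]) == 4   # four papers with >= 4 units of feedback
assert h_index([25, 8, 5, 3, 3]) == 3
assert h_index([]) == 0
```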
We design an algorithm that, given query access to the n publications of a user and each publication’s corresponding positive feedback number, outputs a (1± ε)-approximation of the h-index of this user with probability at least 1-δ in time O(n⋅ln(1/δ) / (ε²⋅h)), where h is the actual h-index, which is unknown to the algorithm a priori. We then design a novel lower bound technique that allows us to prove that this bound is in fact asymptotically optimal for this problem in all parameters n, h, ε, and δ.
Our work is among the first in sublinear time algorithms to address obtaining asymptotically optimal bounds, especially in terms of the error and confidence parameters. As such, we focus on designing novel techniques for this task. In particular, our lower bound technique seems quite general: to showcase this, we also use our approach to prove an asymptotically optimal lower bound for the problem of estimating the number of triangles in a graph in sublinear time, which is now also optimal in the error and confidence parameters. This latter result improves upon prior lower bounds of Eden, Levi, Ron, and Seshadhri (FOCS'15) for this problem, as well as multiple follow-up works that extended this lower bound to other subgraph counting problems.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.48/LIPIcs.APPROX-RANDOM.2022.48.pdf
Sublinear time algorithms
h-index
asymptotically optimal bounds
lower bounds
subgraph counting
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
49:1
49:19
10.4230/LIPIcs.APPROX/RANDOM.2022.49
article
Maximizing a Submodular Function with Bounded Curvature Under an Unknown Knapsack Constraint
Klimm, Max
1
Knaack, Martin
1
Institute for Mathematics, Technische Universität Berlin, Germany
This paper studies the problem of maximizing a monotone submodular function under an unknown knapsack constraint. A solution to this problem is a policy that decides which item to pack next based on the past packing history. The robustness factor of a policy is the worst case ratio between the value of the solution obtained by following the policy and that of an optimal solution that knows the knapsack capacity. We develop an algorithm with a robustness factor that is decreasing in the curvature c of the submodular function. For the extreme case c = 0, corresponding to a modular objective, it matches a previously known and best possible robustness factor of 1/2. For the other extreme case of c = 1, it yields a robustness factor of ≈ 0.35, improving over the best previously known robustness factor of ≈ 0.06.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.49/LIPIcs.APPROX-RANDOM.2022.49.pdf
submodular function
knapsack
approximation algorithm
robust optimization
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
50:1
50:16
10.4230/LIPIcs.APPROX/RANDOM.2022.50
article
Some Results on Approximability of Minimum Sum Vertex Cover
Stanković, Aleksa
1
https://orcid.org/0000-0002-8416-8665
Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden
We study the Minimum Sum Vertex Cover problem, which asks for an ordering of vertices in a graph that minimizes the total cover time of edges. In particular, the n vertices of the graph are visited according to an ordering, which induces, for each edge, the first time at which it is covered. The goal of the problem is to find the ordering which minimizes the sum of the cover times over all edges in the graph.
In this work we give the first explicit hardness of approximation result for Minimum Sum Vertex Cover. In particular, assuming the Unique Games Conjecture, we show that the Minimum Sum Vertex Cover problem cannot be approximated within 1.014. The best known approximation ratio for Minimum Sum Vertex Cover is currently 16/9, due to a recent work of Bansal, Batra, Farhadi, and Tetali.
We also revisit an approximation algorithm for regular graphs outlined in the work of Feige, Lovász, and Tetali, and show that Minimum Sum Vertex Cover can be approximated within 1.225 on regular graphs.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.50/LIPIcs.APPROX-RANDOM.2022.50.pdf
Hardness of approximation
approximability
approximation algorithms
Label Cover
Unique Games Conjecture
Vertex Cover
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
51:1
51:23
10.4230/LIPIcs.APPROX/RANDOM.2022.51
article
(1+ε)-Approximate Shortest Paths in Dynamic Streams
Elkin, Michael
1
https://orcid.org/0000-0003-2034-812X
Trehan, Chhaya
2
https://orcid.org/0000-0002-3249-3212
Ben-Gurion University of the Negev, Beer-Sheva, Israel
London School of Economics & Political Science, UK
Computing approximate shortest paths in the dynamic streaming setting is a fundamental challenge that has been intensively studied. Existing solutions for this problem either build a sparse multiplicative spanner of the input graph and compute shortest paths in the spanner offline, or compute an exact single source BFS tree. Solutions of the first type are doomed to incur a stretch-space tradeoff of 2κ-1 versus n^{1+1/κ}, for an integer parameter κ. (In fact, existing solutions also incur an extra factor of 1+ε in the stretch for weighted graphs, and an additional factor of log^{O(1)}n in the space.) The only existing solution of the second type uses n^{1/2 - O(1/κ)} passes over the stream (for space O(n^{1+1/κ})), and applies only to unweighted graphs.
In this paper we show that (1+ε)-approximate single-source shortest paths can be computed with Õ(n^{1+1/κ}) space using just a constant number of passes in unweighted graphs, and polylogarithmically many passes in weighted graphs. Moreover, the same result applies to multi-source shortest paths, as long as the number of sources is O(n^{1/κ}). We achieve these results by devising efficient dynamic streaming constructions of (1 + ε, β)-spanners and hopsets.
On our way to these results, we also devise a new dynamic streaming algorithm for the 1-sparse recovery problem. Even though our algorithm for this task is slightly inferior to the existing algorithms of [S. Ganguly, 2007; Graham Cormode and D. Firmani, 2013], we believe that it is of independent interest.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.51/LIPIcs.APPROX-RANDOM.2022.51.pdf
Shortest Paths
Dynamic Streams
Approximate Distances
Spanners
Hopsets
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
52:1
52:16
10.4230/LIPIcs.APPROX/RANDOM.2022.52
article
Caching with Reserves
Ibrahimpur, Sharat
1
Purohit, Manish
2
Svitkina, Zoya
2
Vee, Erik
2
Wang, Joshua R.
2
University of Waterloo, Canada
Google Research, Mountain View, CA, USA
Caching is among the most well-studied topics in algorithm design, in part because it is such a fundamental component of many computer systems. Much of traditional caching research studies cache management for a single-user or single-processor environment. In this paper, we propose two related generalizations of the classical caching problem that capture issues that arise in a multi-user or multi-processor environment. In the caching with reserves problem, a caching algorithm is required to maintain at least k_i pages belonging to user i in the cache at any time, for some given reserve capacities k_i. In the public-private caching problem, the cache of total size k is partitioned into subcaches, a private cache of size k_i for each user i and a shared public cache usable by any user. In both of these models, as in the classical caching framework, the objective of the algorithm is to dynamically maintain the cache so as to minimize the total number of cache misses.
We show that caching with reserves and public-private caching models are equivalent up to constant factors, and thus focus on the former. Unlike classical caching, both of these models turn out to be NP-hard even in the offline setting, where the page sequence is known in advance. For the offline setting, we design a 2-approximation algorithm, whose analysis carefully keeps track of a potential function to bound the cost. In the online setting, we first design an O(ln k)-competitive fractional algorithm using the primal-dual framework, and then show how to convert it online to a randomized integral algorithm with the same guarantee.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.52/LIPIcs.APPROX-RANDOM.2022.52.pdf
Approximation Algorithms
Online Algorithms
Caching
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
53:1
53:15
10.4230/LIPIcs.APPROX/RANDOM.2022.53
article
Space Optimal Vertex Cover in Dynamic Streams
Naidu, Kheeran K.
1
https://orcid.org/0000-0002-5946-4702
Shah, Vihan
2
Department of Computer Science, University of Bristol, UK
Department of Computer Science, Rutgers University, Piscataway, NJ, USA
We optimally resolve the space complexity for the problem of finding an α-approximate minimum vertex cover (αMVC) in dynamic graph streams. We give a randomised algorithm for αMVC which uses O(n²/α²) bits of space, matching Dark and Konrad’s lower bound [CCC 2020] up to constant factors. By computing a random greedy matching, we identify "easy" instances of the problem which can trivially be solved by returning the entire vertex set. The remaining "hard" instances then have sparse induced subgraphs, which we exploit to get our space savings and solve αMVC.
Achieving this type of optimality result is crucial for providing a complete understanding of a problem, and it has been gaining interest within the dynamic graph streaming community. For connectivity, Nelson and Yu [SODA 2019] improved the lower bound showing that Ω(n log³ n) bits of space is necessary while Ahn, Guha, and McGregor [SODA 2012] have shown that O(n log³ n) bits is sufficient. For finding an α-approximate maximum matching, the upper bound was improved by Assadi and Shah [ITCS 2022] showing that O(n²/α³) bits is sufficient while Dark and Konrad [CCC 2020] have shown that Ω(n²/α³) bits is necessary. The space complexity, however, remains unresolved for many other dynamic graph streaming problems where further improvements can still be made.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.53/LIPIcs.APPROX-RANDOM.2022.53.pdf
Graph Streaming Algorithms
Vertex Cover
Dynamic Streams
Approximation Algorithm
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
54:1
54:21
10.4230/LIPIcs.APPROX/RANDOM.2022.54
article
Approximating LCS and Alignment Distance over Multiple Sequences
Das, Debarati
1
Saha, Barna
2
Pennsylvania State University, University Park, PA, USA
University of California, San Diego, CA, USA
We study the problem of aligning multiple sequences with the goal of finding an alignment that either maximizes the number of aligned symbols (the longest common subsequence (LCS) problem), or minimizes the number of unaligned symbols (the alignment distance, i.e., the complement of LCS). Multiple sequence alignment is a well-studied problem in bioinformatics and is used routinely to identify regions of similarity among DNA, RNA, or protein sequences to detect functional, structural, or evolutionary relationships among them. It is known that exact computation of LCS or alignment distance of m sequences each of length n requires Θ(n^m) time unless the Strong Exponential Time Hypothesis is false. However, unlike the case of two strings, fast algorithms to approximate LCS and alignment distance of multiple sequences are lacking in the literature. A major challenge in this area is to break the triangle inequality barrier. Specifically, by splitting the m sequences into two (roughly) equal sized groups, computing the alignment distance in each group, and finally combining them using the triangle inequality, it is possible to achieve a 2-approximation in Õ_m(n^⌈m/2⌉) time. But an approximation factor below 2, which would require breaking the triangle inequality barrier, is not known in O(n^{α m}) time for any α < 1. We make significant progress in this direction.
First, we consider a semi-random model where we show that if just one out of the m sequences is (p,B)-pseudorandom, then we can get a below-two approximation in Õ_m(nB^{m-1}+n^{⌊m/2⌋+3}) time. Such semi-random models are very well-studied in the two-string scenario; however, directly extending those works requires all but one of the sequences to be pseudorandom, and would only give an O(1/p) approximation. We overcome these obstacles with significant new ideas. Specifically, an ingredient of this proof is a new algorithm that achieves a below-2 approximation when the alignment distance is large, in Õ_m(n^{⌊m/2⌋+2}) time. This could be of independent interest.
Next, for LCS of m sequences each of length n, we show that if the optimum LCS is λn for some λ ∈ [0,1], then in Õ_m(n^{⌊m/2⌋+1}) time, we can return a common subsequence of length at least λ²n/(2+ε) for any arbitrary constant ε > 0. In contrast, for two strings, the best known subquadratic algorithm may return a common subsequence of length Θ(λ⁴n).
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.54/LIPIcs.APPROX-RANDOM.2022.54.pdf
String Algorithms
Approximation Algorithms
eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2022-09-15
245
55:1
55:18
10.4230/LIPIcs.APPROX/RANDOM.2022.55
article
A Primal-Dual Algorithm for Multicommodity Flows and Multicuts in Treewidth-2 Graphs
Friedrich, Tobias
1
https://orcid.org/0000-0003-0076-6308
Issac, Davis
1
https://orcid.org/0000-0001-5559-7471
Kumar, Nikhil
1
https://orcid.org/0000-0001-8634-6237
Mallek, Nadym
1
https://orcid.org/0000-0002-4370-5145
Zeif, Ziena
1
https://orcid.org/0000-0003-0378-1458
Hasso Plattner Institute, Universität Potsdam, Germany
We study the problem of multicommodity flow and multicut in treewidth-2 graphs and prove bounds on the multiflow-multicut gap. In particular, we give a primal-dual algorithm for computing multicommodity flow and multicut in treewidth-2 graphs and prove the following approximate max-flow min-cut theorem: given a treewidth-2 graph, there exists a multicommodity flow of value f with congestion 4, and a multicut of capacity c such that c ≤ 20 f. This implies a multiflow-multicut gap of 80 and improves upon the previous best known bounds for such graphs. Our algorithm runs in polynomial time when all the edges have capacity one. Our algorithm is completely combinatorial and builds upon the primal-dual algorithm of Garg, Vazirani and Yannakakis for multicut in trees and the augmenting paths framework of Ford and Fulkerson.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol245-approx-random2022/LIPIcs.APPROX-RANDOM.2022.55/LIPIcs.APPROX-RANDOM.2022.55.pdf
Approximation Algorithms
Multicommodity Flow
Multicut