Improved Parallel Derandomization via Finite Automata with Applications
Abstract
A central approach to algorithmic derandomization is the construction of small-support probability distributions that "fool" randomized algorithms, often enabling efficient parallel (NC) implementations. An abstraction of this idea is fooling polynomial-space statistical tests computed via finite automata [Sivakumar STOC'02]; this encompasses a wide range of properties including k-wise independence and sums of random variables.
We present new parallel algorithms to fool finite-state automata, with significantly reduced processor complexity. Briefly, our approach is to iteratively sparsify distributions using a work-efficient lattice rounding routine and maintain accuracy by tracking an aggregate weighted error that is determined by the Lipschitz value of the statistical tests being fooled.
We illustrate with improved applications to the Gale-Berlekamp Switching Game and to approximate MAX-CUT via SDP rounding. These further involve several optimizations, such as the truncation of the state space of the automata and FFT-based convolutions to compute transition probabilities efficiently.
Keywords and phrases: Parallel Algorithms, Derandomization, MAX-CUT, Gale-Berlekamp Switching Game
Funding: Jeff Giliberti: JG gratefully acknowledges financial support by the Fulbright U.S. Graduate Student Program, sponsored by the U.S. Department of State and the Italian-American Fulbright Commission. This content does not necessarily represent the views of the Program.
2012 ACM Subject Classification: Theory of computation → Parallel algorithms; Theory of computation → Pseudorandomness and derandomization
Acknowledgements: Thanks to Mohsen Ghaffari and Christoph Grunau for explaining the works [19] and [20].
Editors: Anne Benoit, Haim Kaplan, Sebastian Wild, and Grzegorz Herman
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
A fundamental problem in the theory of computation is to derandomize existing randomized algorithms. Such randomized algorithms typically use a large number of independent random bits. One main genre of derandomization is the construction of a much smaller probability distribution (of polynomial instead of exponential size) that "fools" the randomized algorithm. That is, the behavior of relevant statistics should be similar when presented with fully-independent random bits vs. bits drawn from a small, carefully-constructed correlated distribution. This probability space can be searched exhaustively, and in parallel, to find a specific input with desired properties.
A simple and popular example of this technique comes from a probability space with k-wise independence, for constant k [2, 30, 32, 35]. This preserves basic statistical properties such as means and variances [7, 42]. There are many other constructions for more-advanced properties, e.g., near-k-wise independence [3, 5, 14, 17, 38], probability spaces fooling halfspaces or polytopes [33, 41, 23, 22], ordered branching programs [12, 15, 37, 27, 40], etc.
A significant abstraction of these methods is the construction of probability spaces that fool statistical properties (also called "tests") computed by finite automata [10, 24, 31, 36, 39, 40, 43]. Such tests are ubiquitous in randomized algorithm analysis, encompassing k-wise independence, sums of random variables, and many other properties. For example, they are used in deterministic parallel algorithms for finding balanced independent sets in graphs [26], for applications of the Lovász Local Lemma in combinatorics [25], for covering and packing integer programs [6, 44], for undirected connectivity problems [40], and more.
1.1 Our contribution
We describe new parallel algorithms to fool automata, with significantly reduced processor complexity. The analysis is also simplified because we can cleanly separate out problem-specific optimizations from the general lattice discrepancy problems that are at the core of the algorithm. A summary of our algorithm performance is as follows:
Theorem 1 (Simplified).
Consider a probability space on binary random variables and a collection of statistical tests over , each with possible states and Lipschitz value .
There is a deterministic algorithm to generate a distribution of size to simultaneously “fool” all statistical tests to absolute weighted error . It uses processors and polylogarithmic time, where is the exponent of matrix multiplication.
By way of comparison, the algorithm of [24] would require roughly processors to construct a distribution of size . Our full results are more general, easily allowing for non-binary alphabets, or more complex automata types; see Theorem 5 for details. We emphasize that the algorithm is still typically more efficient even for applications with no particular Lipschitz bounds.
We illustrate this framework through two prototypical applications: the Gale-Berlekamp Switching Game and SDP rounding for approximate MAX-CUT.
Theorem 2 (Gale-Berlekamp Switching Game).
Given an n × n matrix , there is a deterministic parallel algorithm using processors and polylogarithmic time to find satisfying
It is interesting that the Gale-Berlekamp analysis revolves around anti-concentration bounds, which are precisely the opposite of discrepancy-minimization. Our algorithm for this problem beats the complexity of the optimized algorithm of [24].
Theorem 3 (MAX-CUT Approximation).
Let ε > 0 be an arbitrary constant. Given an n-vertex, m-edge graph , there is a deterministic parallel algorithm using processors and polylogarithmic time to find an approximate MAX-CUT of , where is the Goemans-Williamson approximation constant [21].
The SDP relaxation of MAX-CUT can be approximated in polylogarithmic time and near-linear work by [1], so we will be concerned with rounding the SDP solution. A rounding procedure was presented in [43], which however required a very large number of processors. We drastically improve it, getting closer to the sequential runtime of [9].
The complexity bounds in Theorems 2 and 3 do not depend on fast matrix multiplication; the runtimes are valid even using naive matrix multiplication. Although they are still not fully work-efficient, this makes progress toward practical deterministic parallel algorithms.
In a broader perspective, our construction differs in various aspects from classical derandomization via (sequential) PRGs. While our automata framework does not provide a PRG, it seems better suited to efficient parallel derandomization. It is known that, by connections to Boolean circuits [11], a PRG for log-space-bounded computation can be simulated in the PRAM model in polylogarithmic time with a polynomial number of processors. However, relying on the equivalence between PRAM and Boolean circuits leads to very high processor complexities (e.g., Nisan's log-space PRG uses at least processors [39]). Therefore, using known PRG constructions would either require high polynomial complexity (if log-space bounded) or not parallelize well. In addition to providing fast parallelization, our construction returns a distribution with seed length , which is in line with known PRGs.
1.2 Technical approach
Our work follows the same general outline as previous algorithms for fooling automata, built from two main subroutines Reduce and Fool. The Fool algorithm (Section 5) recursively invokes Reduce (Section 4) to construct a fooling distribution, beginning with single-step automaton transitions and progressively scaling up to multi-step distributions. After levels, the final distribution effectively fools -step walks in the automata.
We introduce several novelties, such as a work-efficient Reduce subroutine and an analysis of Fool based on a different measure of error. Here, we attempt to provide some high-level intuition for the technical content of this paper.
Distribution error tracking.
Prior work on automata fooling [24, 36] considered a notion of unweighted absolute error, where the goal was to fool all pairs of states. We replace this with an aggregated notion: the weight of a final state corresponds to the value of the associated final outcome (e.g., the final count for a counter automaton). The value of an intermediate state is then the expected weight of the final state under a random suffix. This is similar to a scheme used in the context of pseudorandom generators and branching programs [12, 15], which takes advantage of sub-maximal influence on the final expected value. The processor complexity is determined by the Lipschitz value of each state: how much the expected final weight can change with a single step of the automaton.
REDUCE subroutine.
This subroutine takes as input a distribution over automata walks (drivestreams) with a weight function ; it outputs a refined distribution with significantly smaller support while maintaining a weighted measure of closeness to . This leverages the connection between automata fooling and lattice approximation noted by [36] and applies the work-efficient lattice approximation of [19]. Our key approach to reducing the processor complexity is to sparsify the distribution – without materializing it explicitly – one bit level at a time.
In each step, we model the problem of rounding the next bit of each element's index in as a lattice-discrepancy problem, which is then solved by invoking [19]. This step resembles the PRG of INW [28]; there, one replaces the concatenation of two independent random walks with correlated random walks, using the derandomized square instead of the algorithm of [19].
Further optimizations.
In both of these applications, and many others, the relevant automata are counters which track the running sum of certain statistics. Instead of keeping track of these exactly, we truncate the counters to within a few standard deviations of their means. This reduces the state space of the automata by roughly a square-root factor. Doing this while also preserving relevant Lipschitz properties of the automata is technically challenging and requires significant analysis. This optimization will be discussed in Section 6.
The approximate MAX-CUT application (Section 7.2) requires a few additional problem-specific optimizations and arguments. In particular, to simulate the SDP rounding procedure efficiently, we (i) work with discretized truncated Gaussians and quantized counters, (ii) use a pessimistic estimator with a better Lipschitz constant, and (iii) compute the transition matrices via FFTs to exploit their convolution structure. We believe that these optimizations may generalize to other (SDP) rounding procedures and other settings as well.
2 Preliminaries
For an input of size , our goal is to develop deterministic PRAM algorithms with processor complexity and runtime. Unless stated otherwise, we assume that all relevant algorithms run in deterministic polylogarithmic time as a function of their processor and input complexities. For processor complexity and other parameters, we use Õ notation, which suppresses factors polylogarithmic in the size of all algorithm inputs. Throughout, we use log for logarithms in base 2 and ln for base e. We write for rounding to the nearest integer.
Throughout, proofs that are omitted (or sketched) appear in the full version.
2.1 Basic definitions for automata
The underlying probability space is defined by drawing a sequence , where each is independently drawn from a distribution over an arbitrary alphabet. We consider an automaton with a state space . At each timestep , the automaton in state receives an input and transitions to state .
For times , we define . In this context, we call the horizon and the pair the window. We refer to a vector as a drivestream. We define to be the result of transiting from time to under . We write to indicate that each has non-zero probability in .
For a distribution on drivestreams, we denote the transition matrix as , i.e. is the probability of transiting from state to for a drivestream . For brevity, we also write and, for any weight function , we also define
Throughout, we define and where is the size of .
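To make these definitions concrete, here is a minimal Python sketch (not from the paper) of running an automaton over a drivestream and tabulating an empirical transition matrix for a small explicitly-given distribution; the state space, transition function, and example distribution are illustrative placeholders.

```python
from collections import defaultdict

def walk(F, state, drivestream):
    """Run an automaton with transition function F from `state` over a drivestream."""
    for x in drivestream:
        state = F(state, x)
    return state

def transition_matrix(F, states, distribution):
    """Empirical transition matrix: H[s][s'] = Pr[walk started at s ends at s']
    for a distribution given as a list of (drivestream, probability) pairs."""
    H = {s: defaultdict(float) for s in states}
    for drivestream, prob in distribution:
        for s in states:
            H[s][walk(F, s, drivestream)] += prob
    return H

# Toy example: a clamped counter over alphabet {0, 1} with states 0..3.
F = lambda s, x: min(s + x, 3)
D = [((1, 0, 1), 0.5), ((0, 0, 0), 0.5)]
print(dict(transition_matrix(F, range(4), D)[0]))  # {2: 0.5, 0: 0.5}
```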
Our goal is to find a polynomial-size distribution on drivestreams that “fools” the automaton. That is, the behavior of the automaton given a drivestream should be similar to its behavior when given a drivestream . We will follow a strategy of [12] and measure error via a weight function over final states. It will be necessary to measure the “smoothness” of as a function of the drivestream values. Formally, we consider two, closely related, notions:
Definition 4 (Lipschitz value/confusion of a weight function).
Let be a weight function for automaton .
- The Lipschitz value at state and time is defined by
where the maximum is over which differ in only the coordinate.
- The confusion at state and time is defined by
Colloquially, these are the maximum change to actual weight or the expected weight of the final state due to changing drivestream entry . Note that .
We define the total variability of a starting state as:
With this rather technical definition, we can state our algorithm concretely:
Theorem 5.
The algorithm Fool takes as input a parameter , and produces a distribution with
The distribution has size . The cost of Fool is processors, plus the cost of computing the expectations for states and times (more details on this step later.)
2.2 Lattice approximation
The problem of automata fooling is closely linked to lattice approximation, defined as follows:
Definition 6 (Lattice Approximation Problem (LAP)).
Given an matrix and fractional vector , the objective is to compute an integral vector to minimize the discrepancies .
Intuitively, this models the process of “rounding” each random bit (represented by ) to a static zero or one (represented by ). The work [19] provides a parallel algorithm with near-optimal discrepancy as well as complexity; we summarize it as follows:
Theorem 7 (Theorem 1.3 of [19]).
Suppose that for all . There is a deterministic parallel algorithm for the LAP with for all , where . The algorithm uses processors, where denotes the number of non-zero entries in the matrix .
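As a minimal illustration of the lattice approximation objective (not of the algorithm of [19] itself), the sketch below computes the per-row discrepancies of a candidate integral rounding; the instance shown is hypothetical.

```python
import numpy as np

def lap_discrepancies(A, x, X):
    """Per-row discrepancies |A_i . (x - X)| between a fractional vector x
    and a candidate integral rounding X."""
    return np.abs(A @ (x - X))

# Hypothetical instance with entries in [0, 1].
A = np.array([[1.0, 0.5, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.5],
              [1.0, 1.0, 1.0, 1.0]])
x = np.array([0.3, 0.7, 0.5, 0.5])
X = np.round(x)  # naive independent rounding; [19] guarantees much smaller discrepancy
print(lap_discrepancies(A, x, X))
```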
It will be convenient to allow for the discrepancy matrix to take arbitrary real values.
Proposition 8.
Let for each row . There is a deterministic parallel algorithm for the LAP where for all , where . The algorithm uses processors.
Proof.
Construct a matrix , where for each row of , the matrix has and . Then apply Theorem 7 to .
3 Overview of algorithms and data structures
The subroutine Reduce (Section 4) is the technical core of the automata-fooling construction: given an input distribution over drivestreams and a weight function , it returns a distribution which is close to (measured in terms of ), but which has much smaller support. Importantly, through the use of appropriate data structures, the cost of Reduce may be significantly less than the total size of itself. Later, the final algorithm Fool (Section 5) will be defined by repeated calls to Reduce over successively larger time-horizons.
Concretely, we store each distribution over window as an array of drivestreams with associated probabilities . Here is the size of . By adding dummy zero entries, we can assume that is a power of two.
For a bitstring of length at most , we denote by the induced distribution consisting of all the drivestreams in whose indices start with . So , where and refer to concatenating a or bit to . Similarly, .
We define the Prediction Problem for a distribution and a weight function as follows: we need to produce a data structure , which can answer the following two types of queries: (i) given any bitstring , return the value ; (ii) given where is a bitstring and is a state, return the value . Each query should take polylogarithmic time and processors. In either case, can take on any length .
Observation 9.
Let be a distribution on window . The Prediction Problem for and a weight function can be solved with processors.
The most important case is for a Cartesian product, which will be used in Section 5. Critically, we can use the structure within a distribution to avoid materializing it explicitly.
Proposition 10.
Given distributions on windows and respectively, the Prediction Problem for distribution and any weight function can be solved with processors.
4 The REDUCE Algorithm
To reiterate, the goal of Reduce (Algorithm 1) is to take an input distribution and weight function , and produce a smaller distribution which is close to it. Note that here will not be the given weight function over final states .
For intuition, consider the following process. Draw elements randomly and independently with replacement from the support of the distribution , wherein is selected with probability proportional to . Then set , so the process is unbiased. Via standard concentration bounds, appropriate choices of ensure that . The algorithm Reduce is based on derandomizing this process via a slowed-down simulation: to compute , we iteratively compute distributions by fixing the bit level of each entry in .
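The following sketch is a randomized stand-in for this slowed-down simulation (it is not the paper's Reduce): each of K sample indices into the support of the input distribution is fixed one bit level at a time, according to the conditional probability mass of each half. In the actual algorithm, the bit choices at every level are produced by the lattice-approximation routine of [19] rather than by random coins; the names reduce_sketch, K, and levels are ours.

```python
import random

def reduce_sketch(D, K, levels):
    """D is a list of (drivestream, prob) pairs of size 2**levels (probabilities
    summing to 1). Select K indices into D by fixing one bit per level, then
    return the uniform distribution over the selected drivestreams."""
    def mass(prefix, length):
        # total probability of entries whose index starts with the given prefix
        lo = prefix << (levels - length)
        hi = (prefix + 1) << (levels - length)
        return sum(p for _, p in D[lo:hi])

    prefixes = [0] * K
    for level in range(1, levels + 1):
        for i in range(K):
            p0 = mass(prefixes[i] << 1, level)      # mass of the "0" half
            p = mass(prefixes[i], level - 1)        # mass of the current prefix
            bit = 0 if random.random() < (p0 / p if p > 0 else 0.5) else 1
            prefixes[i] = (prefixes[i] << 1) | bit  # in Reduce: bit comes from [19]
        # (a real implementation fixes the bits of all K samples at this level
        #  simultaneously via a single lattice approximation instance)
    return [(D[idx][0], 1.0 / K) for idx in prefixes]
```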
In order to measure distribution error, we introduce the following key definition:
Definition 11 (Sensitivity).
For a function and state , the sensitivity is:
The following is our main result analyzing the algorithm:
Theorem 12.
The algorithm Reduce runs in processors, plus the cost of solving the Prediction Problem for . The final distribution is uniform with size . For large enough constant , the distribution satisfies
Proof.
Let us fix ; for brevity, we write for each state .
Each multiset contains bitstrings of length . Consider the distributions obtained by drawing a bitstring uniformly at random and then drawing drivestream from . In particular, and the distribution returned by Reduce is precisely . We have the following equation for every state :
Now, to analyze a given step , observe that is obtained by appending a bit to each bitstring . We expose the choice of the next bit in and as follows
Thus we can calculate the difference between probabilities for and as:
(1)
In light of Eq. (1), we apply Proposition 8 where each state corresponds to a constraint row with entries , and with for all . It is evident that, after solving the Prediction Problem for , the values for the Lattice Approximation Problem at Line 6 can be generated using processors. Furthermore, the maximum spread of the values over is at most , so . Since the matrix has rows and columns, Proposition 8 gives:
By our choice of , this is at most . Over all iterations, this gives the desired bound
5 The FOOL Algorithm
We build the automata-fooling distribution via the algorithm Fool (Algorithm 2).
The algorithm first finds the “expected value” vectors for each distribution . This may take advantage of problem-specific automaton properties, and it may have some small error. We will also describe a few “generic” methods to calculate these probabilities exactly.
The main loop fools distributions on time horizons in a bottom-up fashion, and merges them together using the Reduce procedure from the previous section. Specifically, at each level , it executes Reduce in parallel to combine and to obtain . Note that we never materialize the Cartesian product of and .
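A minimal sketch of this bottom-up structure, under our own simplifying conventions: single_step[t] is a small distribution for time t, reduce is a stand-in for Reduce with the appropriate weight function, the horizon is a power of two, and (purely for clarity) the Cartesian products are materialized here even though the paper's data structures avoid this.

```python
def fool_sketch(single_step, reduce):
    """Merge single-step distributions pairwise, sparsifying with `reduce` at
    every level, until one distribution over full-length drivestreams remains.
    Each distribution is a list of (drivestream tuple, probability) pairs."""
    level = [[((x,), p) for x, p in d] for d in single_step]
    while len(level) > 1:
        merged = []
        for left, right in zip(level[0::2], level[1::2]):
            # Cartesian product of two adjacent windows (materialized only in
            # this sketch), then sparsified back down by reduce.
            product = [(da + db, pa * pb) for da, pa in left for db, pb in right]
            merged.append(reduce(product))
        level = merged
    return level[0]
```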
Theorem 13.
The cost of Fool is processors, plus the cost of computing the vectors . The final distribution has size .
Let us now proceed to analyze the error of the final distribution . For purposes of analysis, we define the exact transition vector and its approximation error by
Theorem 14.
For any state , the final distribution returned by Fool satisfies
Proof Sketch.
For any state and times , let be the maximum value of over . We show by induction on that each for any satisfies
(2)
The base case is vacuous since . The final case establishes the claimed result, since and .
As we have mentioned, there can be problem-specific shortcuts to compute (or approximate) the transition matrices and resulting expected-value vectors . There is the following generic method to compute them via matrix multiplication.
Proposition 15.
All vectors , can be computed with processors, where is the exponent of any efficiently-parallelizable matrix-multiplication algorithm.
In particular, for the Coppersmith-Winograd algorithm [16], we have . (See [29] for further details on parallel implementation.)
Moreover, if we can compute all transition matrices , then we can compute all vectors exactly (for any ) with processors.
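As a sketch of this generic computation (assuming the single-step transition matrices under the independent input distribution are available as dense arrays), the expected-value vectors satisfy a simple backward recurrence; a parallel implementation would instead compute the suffix products by doubling, as in Proposition 15.

```python
import numpy as np

def expected_value_vectors(M, w):
    """M[t][s, s'] = Pr[state s at time t moves to state s' at time t+1] under
    the independent input distribution; w is the weight vector on final states.
    Returns q, where q[t][s] is the expected final weight starting from state s
    at time t, via the backward recurrence q[t] = M[t] @ q[t+1]."""
    T = len(M)
    q = [None] * (T + 1)
    q[T] = np.asarray(w, dtype=float)
    for t in range(T - 1, -1, -1):
        q[t] = M[t] @ q[t + 1]
    return q
```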
Therefore, when dealing with multiple statistical tests, we have the following result.
Corollary 16.
Consider statistical tests , each computed by its own automaton on a state space of size , with its own weight function . When combined into a single automaton, the resulting automaton has the following properties:
1. The total statespace is .
2. Each starting state for automaton has , where denotes the value of for weight function and automaton .
3. The exact vectors can be calculated in processors.
In particular, the Fool algorithm has processor complexity , and the resulting distribution has for each state of automaton .
6 Reducing the state space for counter automata
Given a weight function , let us consider a statistic of the form
We refer to such statistics as counters. They are ubiquitous in randomized algorithms; they are often used directly, and can also be combined into higher-order statistics.
In the automata-fooling setting, it is easy to compute by tracking all possible values for the running sum. However, this is often overkill: typically, the running sum is confined to a much smaller window, roughly the standard deviation around the mean. We can take advantage of this by constructing an automaton which only maintains the running sum within “typical” values close to the mean, along with a “reject” state for some exponentially-rare deviations. Let us define some relevant parameters for the counter.
Given some error parameter , we choose a value with . In this context, we refer to as the span of the automaton.
Observation 17.
For a drivestream drawn randomly from , it holds with probability at least that for all times .
Proof.
Apply Bernstein’s inequality and take a union bound over .
In light of Observation 17, our strategy will be to construct a truncated automaton , which only stores the running sum within a window of from the running mean. Define for each value . The truncated automaton has state space ; given input , it updates its state to a new state as:
Each integer state at time corresponds to a running sum . The state represents a “reject” or “rare” state where the running sum has deviated too far from its mean. We define a related potential function as
Intuitively, measures how close the current state is to a reject state. It can be used to “damp” the weight function and make a gradual transition to the reject state.
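A minimal sketch of a truncated counter automaton, under the simplifying assumptions that the increments are integers already centered at their mean and that the span is an integer; the paper's exact centering, reject handling, and potential function differ in details.

```python
REJECT = "reject"

def make_truncated_counter(increment, span):
    """Return a transition function that tracks the running sum of
    increment(t, x) only while it stays in [-span, span]; once the sum leaves
    this window, the automaton moves to (and stays in) a reject state."""
    def step(state, t, x):
        if state == REJECT:
            return REJECT
        s = state + increment(t, x)
        return s if -span <= s <= span else REJECT
    return step

# Toy usage: +/-1 increments with span 3.
step = make_truncated_counter(lambda t, x: 1 if x else -1, 3)
state = 0
for t, x in enumerate([1, 1, 0, 1, 0, 1]):
    state = step(state, t, x)
print(state)  # 2: the running sum stayed inside the window
```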
For purposes of analysis, it is useful to compare with the "truthful" automaton . The potential function can also be applied on the original state space , where we use the convention that any state outside the truncated window corresponds to the reject state of .
Relevant properties of , such as its transition matrix or its weight variability, can be derived from up to a small error. Indeed, our application to the Gale-Berlekamp Switching Game will use precisely this approach. For the MAX-CUT application, we will need to combine multiple truncated automata; this will require a slightly different damping function and analysis.
Proposition 18.
Let be a weight function for automaton and define
Let denote the transition matrices for automata respectively.
1. For any time and state , there holds .
2. For the starting state and time , there holds
3. For any time and state , there holds
where denotes the confusion with respect to automata and weight function , while denotes the Lipschitz value with respect to automaton and weight function .
Note that, in both these cases, we use the convention that if , then corresponds to the reject state of automaton .
7 Applications
We present the derandomization of two basic problems: the Gale-Berlekamp Switching Game (Section 7.1) and approximate MAX-CUT via SDP rounding (Section 7.2). Through our new construction, we improve the deterministic processor complexity for both problems.
7.1 Gale-Berlekamp Switching Game
The Gale-Berlekamp Switching Game is a classic combinatorial problem: given an n × n matrix with ±1 entries, we want to find ±1 vectors to maximize the imbalance. An elegant randomized algorithm of [4], based on the Central Limit Theorem, gives imbalance of . This was derandomized in [24], using automata-fooling with a processor complexity of .
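For context, here is a hedged sketch of the standard randomized strategy in the spirit of [4] (the function name and interface are ours): pick the column signs uniformly at random and then set each row sign to the sign of the corresponding row sum, so the imbalance equals the sum of absolute row sums.

```python
import numpy as np

def randomized_gale_berlekamp(A, seed=None):
    """A is an n x n matrix with +/-1 entries. Choose x in {-1,+1}^n uniformly
    at random and set y_i = sign(sum_j A[i, j] * x[j]), so every row contributes
    its absolute row sum to the imbalance sum_{i,j} A[i, j] * x[j] * y[i]."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = rng.choice([-1, 1], size=n)
    row_sums = A @ x
    y = np.where(row_sums >= 0, 1, -1)
    imbalance = int(y @ row_sums)   # equals sum_i |row_sums[i]|
    return x, y, imbalance
```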
Theorem 19.
There is a parallel deterministic algorithm to find with imbalance using processors.
In order to show Theorem 19, following [8] and [4], we set if , and otherwise. This gives
As shown in [13], for uniformly drawn from , we have for each . This is the statistical test we want to fool.
We will choose our drivestream values to be , with the uniform distribution on . For each value , we have a truncated counter automaton to track . This uses the construction of Section 6 with , where and . This automaton uses the weight function
Proposition 20.
Fix a row and automaton , and correspondingly let be the full “truthful” automaton. Let be the transition matrices for respectively.
1. For any time and state of , we have .
2. The final state for satisfies .
3. The weight function for automaton has total variability .
Proof.
We apply Proposition 18 with . Clearly and for all . Also, we calculate . With Proposition 18(a), this proves (a). Similarly, by Proposition 18(b), we have
where is the final state for the automaton .
Finally, for (c), we use the definition of total variability and Proposition 18(c) to get:
The complexity now follows from Corollary 16 with , where we have automata each of state space . By applying Proposition 20, we get that for any sequence drawn from the resulting distribution and automaton ,
Therefore, for any row , the final state satisfies
By searching the space exhaustively, we can find a specific sequence satisfying the desired imbalance bounds. This concludes the proof of Theorem 19.
7.2 Approximate MAX-CUT
The MAX-CUT problem for a graph is to find a vertex set to maximize the total weight of edges with exactly one endpoint in . The seminal work of Goemans & Williamson [21] showed that MAX-CUT can be approximated to a factor of by rounding its semi-definite programming (SDP) formulation. Moreover, the integrality gap of the SDP is precisely [18], and assuming the Unique Games Conjecture no better approximation ratio is possible for polynomial-time algorithms [34].
A sequential deterministic -approximation with runtime was shown in [9]. This relies on the method of conditional expectation, which is hard to parallelize. The work of [43] used the automata derandomization framework to get a parallel deterministic -approximation algorithm. The main downside of this approach is the huge polynomial processor complexity (on the order of ). We give an improved analysis with our new automata-fooling construction and some other optimizations.
To review, note that MAX-CUT can be formulated as the following integer program: maximize (1/2) Σ_{(i,j) ∈ E} w_ij (1 − x_i x_j) over x_i ∈ {−1, +1}.
This can be relaxed to the following SDP: maximize (1/2) Σ_{(i,j) ∈ E} w_ij (1 − ⟨v_i, v_j⟩) over unit vectors v_i,
where ⟨·, ·⟩ denotes the inner product in R^n.
We can round a given SDP solution by drawing independent standard Gaussian random variables and constructing the cut , which gives cutsize
where OPT denotes the size of the maximum cut. Following [43], we will derandomize this by fooling each statistical test .
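A hedged sketch of this randomized rounding step (random hyperplane rounding in the style of Goemans-Williamson [21]); the function names and the edge-list format are ours.

```python
import numpy as np

def random_hyperplane_rounding(V, seed=None):
    """V is an n x d matrix whose rows are the unit vectors v_i of the SDP
    solution. Draw a standard Gaussian vector r and place vertex i in S
    according to the sign of <v_i, r>."""
    rng = np.random.default_rng(seed)
    r = rng.standard_normal(V.shape[1])
    return set(np.flatnonzero(V @ r >= 0).tolist())

def cut_value(edges, S):
    """Total weight of edges (u, v, w) with exactly one endpoint in S."""
    return sum(w for u, v, w in edges if (u in S) != (v in S))
```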
Theorem 21.
There is a deterministic parallel algorithm to find an -approximate rounding of a MAX-CUT SDP solution with processors.
Via the algorithm of [1] to solve the SDP, this gives the following main result:
Corollary 22.
For arbitrary constant , we can find an -approximate MAX-CUT solution using processors and polylogarithmic time.
The remainder of the section is devoted to showing Theorem 21. To avoid cumbersome calculations, we will show an algorithm for a -approximation, assuming ε is smaller than any needed constant. For readability, suppresses any terms.
As a starting point, consider the following process. We draw independent standard Gaussian variables . For each edge , we have a statistical test to read these values in order and compute and . We then apply the weight function as the indicator variable that .
In order to apply our derandomization framework efficiently, we make a number of changes to this basic strategy: (i) we define the drivestream value to be a discretized truncated Gaussian; (ii) we further quantize each term in the computation of the sum ; (iii) we apply the optimization of Section 6 to reduce the state space for each counter automaton; and (iv) we modify the weight function to a pessimistic estimator with better Lipschitz properties. Let us fix two quantization parameters for some constant
Concretely, each drivestream value is derived by truncating a standard Gaussian to within , and rounding to the nearest multiple of . Then, each term in the product is rounded to the nearest multiple of . Since this comes up frequently in the analysis, let us denote this "rounded" inner product by . Given this definition, we will form our cut set by setting .
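A minimal sketch of the two quantization steps, with hypothetical parameter names (trunc for the truncation radius, nu for the Gaussian grid, mu for the grid of the summed terms); the actual constants are those chosen in the paper.

```python
import numpy as np

def discretized_truncated_gaussian(d, trunc, nu, seed=None):
    """Draw d standard Gaussians, truncate them to [-trunc, trunc], and round
    each to the nearest multiple of nu (these form the drivestream alphabet)."""
    rng = np.random.default_rng(seed)
    g = np.clip(rng.standard_normal(d), -trunc, trunc)
    return np.round(g / nu) * nu

def rounded_inner_product(v, r, mu):
    """Quantized inner product: each term v_j * r_j is rounded to the nearest
    multiple of mu before summing, so the running sum is always an integer
    multiple of mu and can be tracked by an integer-valued counter automaton."""
    return (np.round((v * r) / mu) * mu).sum()
```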
Lemma 23.
For appropriately chosen , there is a coupling between random variables and such that, for any unit vector , there holds
Observe that the scaled sum is an integer-valued counter. We can apply the method of Section 6 to construct a “vertex” automaton that tracks the running sum within a window. We record a few parameters and observations about this automaton.
Proposition 24.
The automaton can be implemented with the following parameters:
From these vertex-automata, we construct an edge-automaton for each edge . Automaton keeps track of the joint states of , i.e. the truncated running sums and . It uses the following weight function for the state :
Note that whenever , we have , i.e. edge is cut.
Proposition 25.
For an edge , let , and consider the “truthful” automaton (which uses the full untruncated automata for vertices and ). Let denote the transition matrices for respectively.
1. For any time and state of , there holds .
2. The final automaton state for satisfies .
3. For drivestreams differing in coordinate , the final states of automaton have .
4. The weight function has total variability for automaton .
Proof.
The proof is very similar to Proposition 18; see full version.
Next, we discuss how to approximate the transition matrices for the automata. In light of Proposition 25(a), we will compute vectors for the full automaton . This is much easier than computing the transition matrices for , since we do not need to track reject states. In particular, this achieves error compared to .
Proposition 26.
For each automaton , the vectors , can be computed with processors.
Proof Sketch.
We will recursively compute transition matrices for and divisible by . The critical observation here is that since is a pair of counters, we only need to keep track of the probability distribution on the difference pairs
This is effectively a convolution, computable via an FFT.
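A minimal sketch of FFT-based convolution for combining the increment distributions of two halves of a window, in one dimension and with the counter values shifted to a nonnegative range; for the edge automata, which track a pair of counters, the same idea would extend to two dimensions (e.g., via numpy.fft.rfft2).

```python
import numpy as np

def convolve_counts(p, q):
    """p[k] = Pr[first-half increment = k] and q[k] = Pr[second-half increment = k]
    (after shifting by a known offset so indices are nonnegative).
    Returns the distribution of their sum via FFT-based convolution."""
    n = len(p) + len(q) - 1
    m = 1 << (n - 1).bit_length()          # pad to a power of two
    out = np.fft.irfft(np.fft.rfft(p, m) * np.fft.rfft(q, m), m)[:n]
    return np.clip(out, 0.0, None)         # clean up tiny negative FFT noise
```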
We are now ready to compute the total complexity. Each probability distribution has size , thus . The reduced total number of states of an edge-automaton is and the weight function has total variability . So we run Fool with parameter , and the fooling process has cost . Consider the resulting distribution . The edge is only cut if . For each , by Theorem 14, the resulting final state satisfies
Here Proposition 25(a) gives , and Proposition 25(b) gives . Thus, overall, the edge is cut with probability at least
Summing over all edges, the total expected weight of the cut edges is
The first term is at least . The second term is at most since . By searching exhaustively, we can find a cut satisfying these bounds.
References
- [1] Zeyuan Allen-Zhu, Yin Tat Lee, and Lorenzo Orecchia. Using optimization to obtain a width-independent, parallel, simpler, and faster positive SDP solver. In Proc. 27th annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1824–1831, 2016. doi:10.1137/1.9781611974331.ch127.
- [2] Noga Alon, László Babai, and Alon Itai. A fast and simple randomized parallel algorithm for the maximal independent set problem. Journal of Algorithms, 7(4):567–583, 1986. doi:10.1016/0196-6774(86)90019-2.
- [3] Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost k-wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992. doi:10.1002/rsa.3240030308.
- [4] Noga Alon and Joel H Spencer. The probabilistic method. John Wiley & Sons, 2015. URL: http://cs.nyu.edu/cs/faculty/spencer/nogabook/nogabook.html.
- [5] Yossi Azar, Rajeev Motwani, and Joseph Naor. Approximating probability distributions using small sample spaces. Combinatorica, 18(2):151–171, 1998. doi:10.1007/PL00009813.
- [6] Nikhil Bansal, Nitish Korula, Viswanath Nagarajan, and Aravind Srinivasan. Solving packing integer programs via randomized rounding with alterations. Theory of Computing, 8(24):533–565, 2012. doi:10.4086/toc.2012.v008a024.
- [7] Mihir Bellare and John Rompel. Randomness-efficient oblivious sampling. In Proc. 35th annual Symposium on Foundations of Computer Science (FOCS), pages 276–287, 1994. doi:10.1109/SFCS.1994.365687.
- [8] Bonnie Berger. The fourth moment method. SIAM Journal on Computing, 26(4):1188–1207, 1997. doi:10.1137/S0097539792240005.
- [9] Ankur Bhargava and S Rao Kosaraju. Derandomization of dimensionality reduction and SDP based algorithms. In Workshop on Algorithms and Data Structures, pages 396–408. Springer, 2005. doi:10.1007/11534273_35.
- [10] Manuel Blum and Oded Goldreich. Towards a computational theory of statistical tests. In Proc. 33rd annual Symposium on Foundations of Computer Science (FOCS), pages 406–416, 1992. doi:10.1109/SFCS.1992.267749.
- [11] Allan Borodin. On relating time and space to size and depth. SIAM journal on computing, 6(4):733–744, 1977. doi:10.1137/0206054.
- [12] Mark Braverman, Anup Rao, Ran Raz, and Amir Yehudayoff. Pseudorandom generators for regular branching programs. SIAM Journal on Computing, 43(3):973–986, 2014. doi:10.1137/120875673.
- [13] Thomas A Brown and Joel H Spencer. Minimization of ±1 matrices under line shifts. In Colloquium Mathematicum, volume 1, pages 165–171, 1971.
- [14] Suresh Chari, Pankaj Rohatgi, and Aravind Srinivasan. Improved algorithms via approximations of probability distributions. In Proc. 26th annual ACM Symposium on Theory of Computing (STOC), pages 584–592, 1994. doi:10.1145/195058.195411.
- [15] Lijie Chen, William M. Hoza, Xin Lyu, Avishay Tal, and Hongxun Wu. Weighted pseudorandom generators via inverse analysis of random walks and shortcutting. In Proc. 64th annual Symposium on Foundations of Computer Science (FOCS), pages 1224–1239, 2023. doi:10.1109/FOCS57990.2023.00072.
- [16] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. In Proc. 19th annual ACM Symposium on Theory of Computing (STOC), pages 1–6, 1987. doi:10.1145/28395.28396.
- [17] Guy Even, Oded Goldreich, Michael Luby, Noam Nisan, and Boban Veličkovic. Approximations of general independent distributions. In Proc. 24th annual ACM Symposium on Theory of Computing (STOC), pages 10–16, 1992. doi:10.1145/129712.129714.
- [18] Uriel Feige and Gideon Schechtman. On the optimality of the random hyperplane rounding technique for MAX CUT. Random Structures & Algorithms, 20(3):403–440, 2002. doi:10.1002/rsa.10036.
- [19] Mohsen Ghaffari and Christoph Grunau. Work-efficient parallel derandomization II: Optimal concentrations via bootstrapping. In Proc. 56th annual ACM Symposium on Theory of Computing (STOC), pages 1889–1900, 2024. doi:10.1145/3618260.3649668.
- [20] Mohsen Ghaffari, Christoph Grunau, and Václav Rozhoň. Work-efficient parallel derandomization I: Chernoff-like concentrations via pairwise independence. In Proc. 64th annual Symposium on Foundations of Computer Science (FOCS), pages 1551–1562, 2023. doi:10.1109/FOCS57990.2023.00094.
- [21] Michel X Goemans and David P Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM (JACM), 42(6):1115–1145, 1995. doi:10.1145/227683.227684.
- [22] Parikshit Gopalan, Raghu Meka, Omer Reingold, Luca Trevisan, and Salil Vadhan. Better pseudorandom generators from milder pseudorandom restrictions. In Proc. 53rd annual Symposium on Foundations of Computer Science (FOCS), pages 120–129, 2012. doi:10.1109/FOCS.2012.77.
- [23] Parikshit Gopalan, Raghu Meka, Omer Reingold, and David Zuckerman. Pseudorandom generators for combinatorial shapes. In Proc. 43rd annual ACM Symposium on Theory of Computing (STOC), pages 253–262, 2011. doi:10.1145/1993636.1993671.
- [24] David G Harris. Deterministic parallel algorithms for bilinear objective functions. Algorithmica, 81:1288–1318, 2019. doi:10.1007/s00453-018-0471-0.
- [25] David G Harris. Deterministic algorithms for the Lovász local lemma: simpler, more general, and more parallel. Random Structures & Algorithms, 63(3):716–752, 2023. doi:10.1002/rsa.21152.
- [26] David G Harris, Ehab Morsy, Gopal Pandurangan, Peter Robinson, and Aravind Srinivasan. Efficient computation of sparse structures. Random Structures & Algorithms, 49(2):322–344, 2016. doi:10.1002/rsa.20653.
- [27] William M Hoza, Edward Pyne, and Salil Vadhan. Pseudorandom generators for unbounded-width permutation branching programs. In Proc. 12th Innovations in Theoretical Computer Science Conference (ITCS), volume 185, pages 7:1–7:20, 2021. doi:10.4230/LIPIcs.ITCS.2021.7.
- [28] Russell Impagliazzo, Noam Nisan, and Avi Wigderson. Pseudorandomness for network algorithms. In Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, pages 356–364, 1994. doi:10.1145/195058.195190.
- [29] Joseph JáJá. An Introduction to Parallel Algorithms. Addison-Wesley Publishing Company, 1992.
- [30] Anatole Joffe. On a set of almost deterministic k-independent random variables. The Annals of Probability, 2(1):161–162, 1974.
- [31] David R Karger and Daphne Koller. (De)randomized construction of small sample spaces in NC. Journal of Computer and System Sciences, 55(3):402–413, 1997. doi:10.1006/jcss.1997.1532.
- [32] Howard Karloff and Yishay Mansour. On construction of k-wise independent random variables. In Proc. 26th annual ACM Symposium on Theory of Computing (STOC), pages 564–573, 1994. doi:10.1007/BF01196134.
- [33] Zander Kelley and Raghu Meka. Random restrictions and PRGs for PTFs in Gaussian space. In Proc. 37th Computational Complexity Conference (CCC), 2022. doi:10.4230/LIPIcs.CCC.2022.21.
- [34] Subhash Khot, Guy Kindler, Elchanan Mossel, and Ryan O’Donnell. Optimal inapproximability results for MAX-CUT and other 2-variable CSPs? SIAM Journal on Computing, 37(1):319–357, 2007. doi:10.1137/S0097539705447372.
- [35] Michael Luby. A simple parallel algorithm for the maximal independent set problem. In Proc. 17th annual ACM Symposium on Theory of Computing (STOC), pages 1–10, 1985. doi:10.1145/22145.22146.
- [36] Sanjeev Mahajan, Edgar A Ramos, and KV Subrahmanyam. Solving some discrepancy problems in NC. Algorithmica, 29(3):371–395, 2001. doi:10.1007/s004530010046.
- [37] Raghu Meka, Omer Reingold, and Avishay Tal. Pseudorandom generators for width-3 branching programs. In Proc. 51st annual ACM Symposium on Theory of Computing (STOC), pages 626–637, 2019. doi:10.1145/3313276.3316319.
- [38] Joseph Naor and Moni Naor. Small-bias probability spaces: Efficient constructions and applications. In Proc. 22nd annual ACM Symposium on Theory of Computing (STOC), pages 213–223, 1990. doi:10.1145/100216.100244.
- [39] Noam Nisan. Pseudorandom generators for space-bounded computations. In Proc. 22nd annual ACM Symposium on Theory of Computing (STOC), pages 204–212, 1990. doi:10.1145/100216.100242.
- [40] Noam Nisan. RL ⊆ SC. In Proc. 24th annual ACM Symposium on Theory of Computing (STOC), pages 619–623, 1992. doi:10.1145/129712.129772.
- [41] Ryan O’Donnell, Rocco A Servedio, and Li-Yang Tan. Fooling Gaussian PTFs via local hyperconcentration. In Proc. 52nd annual ACM Symposium on Theory of Computing (STOC), pages 1170–1183, 2020. doi:10.1145/3357713.3384281.
- [42] Jeanette P Schmidt, Alan Siegel, and Aravind Srinivasan. Chernoff–Hoeffding bounds for applications with limited independence. SIAM Journal on Discrete Mathematics, 8(2):223–250, 1995. doi:10.1137/S089548019223872X.
- [43] D Sivakumar. Algorithmic derandomization via complexity theory. In Proc. 34th annual ACM Symposium on Theory of Computing (STOC), pages 619–626, 2002. doi:10.1145/509907.509996.
- [44] Aravind Srinivasan. New approaches to covering and packing problems. In Proc. 12th annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 567–576, 2001. URL: http://dl.acm.org/citation.cfm?id=365411.365535.
