
Online Condensing of Unpredictable Sources via Random Walks

Dean Doron, Ben-Gurion University of the Negev, Beer-Sheva, Israel; Dana Moshkovitz, University of Texas at Austin, TX, USA; Justin Oh, University of Haifa, Israel; David Zuckerman, University of Texas at Austin, TX, USA
Abstract

A natural model of a source of randomness consists of a long stream of symbols X = X_1 ∘ ⋯ ∘ X_t, with some guarantee on the entropy of X_i conditioned on the outcome of the prefix x_1, …, x_{i−1}. We study unpredictable sources, a generalization of the almost Chor–Goldreich (CG) sources considered in [9]. In an unpredictable source X, for a typical draw of x ∼ X, for most indices i, the symbol x_i has a low probability of occurring given x_1, …, x_{i−1}. Such a model relaxes the often unrealistic assumption of a CG source that for every i and every x_1, …, x_{i−1}, the next symbol X_i has sufficiently large entropy. Unpredictable sources subsume all previously considered notions of almost CG sources, including notions that [9] failed to analyze, and including those that are equivalent to general sources with high min-entropy.

For a lossless expander G = (V, E) with m = log|V|, we consider a random walk V_0, V_1, …, V_t on G using unpredictable instructions that have sufficient entropy with respect to m. Our main theorem is that for almost all of the steps t/2 ≤ i ≤ t in the walk, the vertex V_i is close to a distribution with min-entropy at least m − O(1).

As a result, we obtain seeded online condensers with constant entropy gap, and seedless (deterministic) condensers outputting a constant fraction of the entropy. In particular, our condensers run in space comparable to the output entropy, as opposed to the size of the stream, and even when the length t of the stream is not known ahead of time. As another corollary, we obtain a new extractor based on expander random walks that handles lower entropy than the classic expander-based construction relying on spectral techniques [11].

As our main technical tool, we provide a novel analysis covering a key case of adversarial random walks on lossless expanders that [9] fails to address. As part of the analysis, we provide a “chain rule for vertex probabilities”. The standard chain rule states that for every x in the support of X and every i, Pr(x_1, …, x_i) = Pr[X_i = x_i | X_{[1,i−1]} = x_1, …, x_{i−1}] · Pr(x_1, …, x_{i−1}). If W(x_1, …, x_i) is the vertex reached using x_1, …, x_i, then the chain rule for vertex probabilities essentially states that the same phenomenon occurs for a typical x:

Pr[V_i = W(x_1, …, x_i)] ≈ Pr[X_i = x_i | X_{[1,i−1]} = x_1, …, x_{i−1}] · Pr[V_{i−1} = W(x_1, …, x_{i−1})],

where V_i is the vertex distribution of the random walk at step i using X.

Keywords and phrases:
Randomness Extractors, Expander Graphs
Funding:
Dana Moshkovitz: Supported in part by NSF Grant CCF-2312573 and CCF-2200956.
Justin Oh: Supported by NSF Grant CCF-2312573. This work was done while a student at the University of Texas at Austin.
David Zuckerman: Supported in part by NSF Grant CCF-2312573 and a Simons Investigator Award (#409864).
Copyright and License:
© Dean Doron, Dana Moshkovitz, Justin Oh, and David Zuckerman; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation → Expander graphs and randomness extractors
Related Version:
Full Version: https://eccc.weizmann.ac.il/report/2024/165/ [10]
Editors:
Srikanth Srinivasan

1 Introduction

Randomness is an extremely useful and ubiquitous tool in computer science. Algorithms, protocols and reductions often assume access to uniformly distributed bits. An inherent question is what kind of randomness we can reasonably obtain from nature (or engineering), and whether we can make such randomness as useful as uniform bits. This has spawned a long line of research with many deep and interesting results.

Random walks and their analysis are also an essential tool in computer science, as well as in mathematics and physics. It is natural, therefore, to ask

How do random walks behave when the instructions for each step are not truly uniform and independent?

Such a scenario occurs when the instructions come from a weak source of randomness. A common assumption about a weak source is that overall, it has min-entropy. (We say X over {0,1}^n has min-entropy k, denoted H_∞(X) ≥ k, if max_x Pr[X = x] ≤ 2^{−k}. We call such an X a k-source and k/n its entropy rate.) The hope is that the quality of the vertex distribution of the random walk is much better, and more useful, than that of the original source.
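
As a tiny numerical illustration of this definition (a minimal sketch in Python, not from the paper):

```python
import math

def min_entropy(probs):
    """H_inf(X) = -log2(max_x Pr[X = x]) for a distribution given as a list of probabilities."""
    return -math.log2(max(probs))

# A 2-bit source whose heaviest outcome has probability 1/2 is a 1-source:
# its min-entropy is 1 and its entropy rate is 1/2.
print(min_entropy([0.5, 0.25, 0.125, 0.125]))  # -> 1.0
```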

Gillman’s Chernoff bound for random walks on expanders [11] implies that most nodes on an expander random walk are close to uniform for any source with entropy rate very close to 1 (see, e.g., [21]). However, random walks cannot mix for general rate-1/2 sources, since an adversary that controls half the steps can have the even-numbered steps undo the odd-numbered steps. It is therefore interesting to ask if any structure in the source can enable random walks to mix for lower entropy rates. The first paper to address this question was [9], which showed that for certain low-rate sources random walks do mix well.

Successful analyses of such random walks give clean constructions of extractors and condensers, which purify the randomness in a weak source. Specifically, an extractor is a function 𝖤𝗑𝗍: {0,1}^n × {0,1}^ℓ → {0,1}^m that uses an independent and uniform ℓ-bit seed Y to convert X into a distribution 𝖤𝗑𝗍(X, Y) that is statistically close to uniform. A condenser is slightly weaker: It is a function 𝖢𝗈𝗇𝖽: {0,1}^n × {0,1}^ℓ → {0,1}^m that converts X into a distribution with high entropy rate. If 𝖢𝗈𝗇𝖽(X, Y) is close to a k-source, then successful condensing means that k/m ≫ k/n. The ℓ-bit string in both cases is called the seed. In certain cases, seedless (deterministic) extraction and condensing, where ℓ = 0, is also possible.

A natural family of weak sources is one where X is a long stream of short symbols, X = X_1 ∘ ⋯ ∘ X_t ∈ {0,1}^{dt}, with n = dt, and with each symbol being revealed one at a time. Indeed, historically, some of the first definitions of weak sources [14, 5] were streaming models. Similarly, a very natural question is how well a random walk mixes when one uses the stream of short symbols as instructions. The streaming model of randomness also corresponds to common sources of randomness in practice. Probably the most popular sources of entropy involve the exact timing of interrupts from mouse movements, keyboard strokes, disk I/O, receiving network packets, and other unpredictable events. Other sources include thermal noise and repeatedly looking at the last few digits of a clock timed according to an independent clock.

To model such streaming sources, one needs some property that implies that each X_i ∈ {0,1}^d has some entropy, even conditioned on the previously observed x_1, …, x_{i−1} (we often abbreviate this as x_{[1,i−1]}). Commonly studied notions such as sequences of independent sources, Santha–Vazirani (SV) sources [14], and Chor–Goldreich (CG) sources [5] indeed have this property. A δ-CG source is a sequence of random blocks X = X_1 ∘ ⋯ ∘ X_t, each X_i ∈ {0,1}^d, such that for any i and any prefix a ∈ {0,1}^{d(i−1)}, it holds that H_∞(X_i | X_{[1,i−1]} = a) ≥ δd. A previous work [9] shows that random walks using CG sources can in fact mix, and obtains excellent deterministic condensers as a result.

One distinct advantage of random-walk-based extractors and condensers is that they are readily online. That is, it is not necessary to know ahead of time how long the stream X_1, …, X_t is; nevertheless, the procedure can utilize most of the total entropy k contained in the stream, by processing each symbol of the stream sequentially in a read-once fashion, and in space comparable to the amount of entropy m we need, as opposed to the length of the stream t. We emphasize that the notion of online extracting and condensing goes hand in hand with streaming models of randomness such as CG sources, which in turn go hand in hand with random walks. Moreover, aside from a few works [7, 8, 9], online constructions for various types of sources are scarce.

Unfortunately, the assumptions of a CG source are quite strong, and may be unrealistic in practice. In a CG source, no matter the outcome of the previous symbols x_1, …, x_{i−1}, it must be the case that the next symbol X_i has high entropy. In some sense, such a definition asserts that the randomness stream can contain no “errors” of a certain type. Although [9] effectively analyzes certain kinds of errors, for others, it completely fails.

Our work continues the line of inquiry in [9] by generalizing the notion of CG sources and asking whether online condensing of randomness streams is possible even in the presence of errors. We give a novel analysis of random walks using a stream of symbols, which may be of independent interest, showing that mixing is possible even in the presence of a very general notion of errors. Thus, we give online extractors and condensers for a more general and practical class of sources. The class of randomness streams we consider is what we call unpredictable sources.

Definition 1 (unpredictable source, simplified).

We say that X = X_1 ∘ ⋯ ∘ X_t, each X_i ∈ {0,1}^d, is a (δ, ρ) (simplified) unpredictable source if, for every i, with probability at least 1 − ρ over σ ∼ X, it holds that Pr[X_i = σ_i | X_{[1,i−1]} = σ_{[1,i−1]}] ≤ 2^{−δd}.

In words, in an unpredictable source, for every i, with high probability over a sample σ ∼ X, the next symbol σ_i is unlikely, conditioned on its prefix. This notion forgoes the often demanding assumption of CG sources: It is no longer true that X_i has high entropy regardless of the outcome of the previous symbols. (For another simplified instantiation of unpredictable sources, see Definition 6.)

We think of ρ as the error parameter of the source. Intuitively, it represents the probability of seeing a low-entropy step. As we discuss later on, the analysis of [9] completely fails on unpredictable sources, and so online condensing of such sources was not previously known. Our results even hold for a more general notion of unpredictable source, where only the average “error” over i needs to be small.

Definition 2 (unpredictable source).

We say that X = X_1 ∘ ⋯ ∘ X_t, each X_i ∈ {0,1}^d, is a (δ, ρ) unpredictable source if, defining

ρ_i(X, δ) = Pr_{x ∼ X}[ Pr[X_i = x_i | X_{[1,i−1]} = x_{[1,i−1]}] > 2^{−δd} ],

it holds that 𝔼_{i ∼ [t]}[ρ_i(X, δ)] ≤ ρ.
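
To make the definition concrete, the following sketch (Python, not from the paper) computes ρ_i(X, δ) exactly for a small, explicitly represented distribution over streams; the example distribution, which is the all-zeros stream with probability 0.1 and a uniform stream otherwise, is the kind of mixture discussed again in Section 1.5.

```python
from collections import defaultdict
from itertools import product

def rho_i(dist, i, delta, d):
    """rho_i(X, delta) from Definition 2, for X given explicitly as a dict mapping
    each stream (a tuple of symbols in range(2**d)) to its probability."""
    prefix_mass = defaultdict(float)   # Pr[X_[1,i-1] = prefix]
    step_mass = defaultdict(float)     # Pr[X_[1,i] = prefix followed by the i-th symbol]
    for x, p in dist.items():
        prefix_mass[x[:i - 1]] += p
        step_mass[x[:i]] += p
    bad = 0.0                          # mass of streams whose i-th symbol is too predictable
    for x, p in dist.items():
        conditional = step_mass[x[:i]] / prefix_mass[x[:i - 1]]
        if conditional > 2 ** (-delta * d):
            bad += p
    return bad

# Example: d = 1, t = 8; the stream equals 0^t with probability 0.1 and is uniform otherwise.
d, t, err = 1, 8, 0.1
dist = {x: (1 - err) / 2 ** t for x in product(range(2), repeat=t)}
dist[(0,) * t] += err
print([round(rho_i(dist, i, delta=0.5, d=d), 3) for i in range(1, t + 1)])
# The early rho_i are 0 and the later ones approach 0.1, so on average the
# source is roughly a (0.5, 0.1) unpredictable source.
```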

We present and study this definition for two main reasons. The first is that it seems natural. In particular, it elegantly generalizes all previous notions of almost CG source considered in [9] (see Definition 6). The second is that, perhaps surprisingly, this notion is weak enough to capture arbitrary high-entropy sources. Indeed, not only are unpredictable sources close to having high min-entropy, but the converse is true as well: Arbitrary (1 − ρ)n-sources are (1/2, 2ρ) unpredictable sources! (see [10, Proposition 6.4], the full version of this extended abstract). Since (1 − ρ)n-sources cannot be condensed (without a seed) beyond their entropy rate (see, e.g., [10, Claim A.1]), the following observation is immediate:

Observation 3.

For δ ≤ 1/2, there is no deterministic condenser for (δ, ρ) unpredictable sources to entropy rate > 1 − ρ/2.

Nevertheless, we give an elegant construction of a seeded condenser for such sources beyond this entropy rate, and a deterministic condenser for such sources close to this rate.

The crux of our result is an analysis that shows that random walks using unpredictable sources can “mix” sufficiently well.

Theorem 1 (main (informal)).

(See [10, Theorem 3.14, Corollary 3.16]) Let δ, ρ > 0 and D = 2^d > 1 be constants. Let X = X_1 ∘ ⋯ ∘ X_t, X_i ∈ {0,1}^d, be a (δ, ρ) unpredictable source. Let G be a sufficiently good D-regular lossless expander on an appropriately chosen number of vertices M = 2^m. (See Section 1.3 for a discussion about lossless expanders. M = M(d, t, δ, ρ) is chosen to be up to roughly 2^k, for k being the (smooth) min-entropy of X.)

Suppose that the X_i-s are used as instructions for a random walk on G from an arbitrary starting vertex. Given x ∼ X, let W_i(x) denote the vertex reached in the i-th step using x as instructions, and let Z_i(x) = −log Pr_{X}[W_i(X) = W_i(x)]. Then,

Pr_{i ∼ [t/2, t], x ∼ X}[ Z_i(x) < m − O(1) ] ≤ O(ρ).

In words, for most steps in the second half of the random walk, most of the vertices reached are unlikely. Thus, for a random i, W_i(X) is close to an extremely high-entropy distribution, namely one with constant entropy gap! We emphasize again that the analysis requires a new technique, since the analysis in [9] fails to give anything useful for unpredictable sources. Moreover, this works for low entropy rate, unlike the high-entropy result that follows from Gillman’s Chernoff bound for random walks. Constructions of online condensers and extractors follow as corollaries to this analysis, and we discuss them next.

1.1 Online Extracting and Condensing

Due to the streaming nature of our source X, one would like a condenser (or extractor) for such sources to process each symbol sequentially as it is received. The notion of online condensing achieves exactly this. As in the model from Dodis, Guo, Stephens-Davidowitz, and Xie [7, 8], in (deterministic) online condensing, the function 𝖢𝗈𝗇𝖽 is implemented by a procedure that starts in a state S_0, and makes a sequence of calls to an update procedure

S_{i+1} ← 𝖴𝗉𝖽𝖺𝗍𝖾(S_i, X_{i+1}).

The length of each state S_i should be not much larger than the final output length m. The procedure may then output the final state S_t ∈ {0,1}^m (or perhaps some function of S_t).

We clarify that the question of whether online condensing is possible is entirely orthogonal to the question of whether seeded condensing is possible. Indeed, a natural question to ask is whether (and how well) general weak sources can be condensed in an online manner. In the case of seeded online condensing (and in some regimes, we provably must use a seed), one may consider an update procedure that also takes as input a seed Y ∈ {0,1}^ℓ that is independent of the stream X, and computes S_{i+1} ← 𝖴𝗉𝖽𝖺𝗍𝖾(S_i, X_{i+1}, Y). (Since the seed length typically depends on t, when the length of the stream t is not known in advance, one can model the use of a uniform seed by also viewing it as a stream of uniform and independent bits. For example, the seed can be initialized to the empty string (or some constant-length uniform string), and 𝖢𝗈𝗇𝖽 may call, in conjunction with each update, an additional procedure Y ← 𝖤𝗑𝗍𝖾𝗇𝖽𝖲𝖾𝖾𝖽(S_i, Y), that may choose to increase the length of Y by one or several bits, depending on the current state S_i. Generally, the choice to extend Y will depend on how many symbols 𝖢𝗈𝗇𝖽 has seen so far, which would be stored in the state.) The guarantee of such a 𝖢𝗈𝗇𝖽 should be that at any point i in the stream, the state S_i should contain most of the entropy seen so far in X_1, …, X_i (or about |S_i| bits, if this length is smaller), and the length of Y relative to i should be small. The hope is that the update procedure accumulates the additional entropy from X_i in each step into the state S_i, and thus the final state S_t contains most of the entropy in X (or about |S_t| bits, if this length is smaller). As noted earlier, the advantage of such an online model of condensing is that it allows one to utilize the entropy contained in the entire stream X, even if one does not know the length of the stream t ahead of time. Moreover, when one only wishes to get m bits of entropy out of a very long stream of length t ≫ m, an online construction uses space comparable to m rather than t.
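
As a minimal sketch of this interface (Python, not from the paper; update and extend_seed are placeholders for whatever concrete procedures one plugs in):

```python
def run_online(stream, update, extend_seed=None, initial_state=b"", seed=b""):
    """Consume a stream symbol by symbol, maintaining a short state S_i and
    (optionally) growing the seed Y as more of the stream arrives."""
    state = initial_state                    # S_0
    for symbol in stream:                    # X_1, X_2, ... arrive one at a time
        if extend_seed is not None:
            seed = extend_seed(state, seed)  # Y <- ExtendSeed(S_i, Y): may append fresh uniform bits
        state = update(state, symbol, seed)  # S_{i+1} <- Update(S_i, X_{i+1}, Y)
    return state                             # the output is S_t (or some function of it)
```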

1.2 Online Condensing via Random Walks

In [9], it was shown that a natural way to condense a randomness stream is to use its symbols as instructions for a random walk over an expander G, starting from an arbitrary fixed vertex (similarly to Theorem 1). The intuition is that if a step in a random walk makes progress towards mixing, then that step accumulates the entropy from the instruction into the vertex distribution. Thus, using the current vertex in the walk as our “state” yields a (deterministic) condenser with output length m = log M, where M is the number of vertices of G. (One may notice that committing to a graph of size M may be problematic when the length of the stream is not known ahead of time, as one would like m to be comparable to t. This can be fixed with a trick of repeatedly increasing the size of the graph at regular intervals. We discuss this more in Section 1.5.1, and for most of the introduction, we will assume that t is known ahead of time.)

In this work, we are primarily focused on sequences X_1, …, X_t that may not mix at every step, as is the case for unpredictable sources. We now give broad intuition on how such erroneous steps affect condensing. Suppose that a symbol X_i is highly correlated with the previous instructions x_1, …, x_{i−1}. Also, assume that the vertex distribution at step i−1 is uniform on some set S of size K. If G is a D-regular graph, then an adversarially chosen X_i may cause the walk to “consolidate” the vertices of S into groups of size D. This would result in a vertex distribution that is uniform on a set of size K/D, and hence d = log D bits of entropy were lost. In general, one can show that this is the worst that can happen, and so if there are very few bad steps overall, then overwhelmingly, mixing, and thus condensing, indeed occurs. Realizing this intuition for unpredictable sources poses several challenges, which we discuss further in Section 1.5.
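
A minimal sketch of this random-walk condenser (Python, not the paper's actual construction): the state is just the current vertex, and 𝖴𝗉𝖽𝖺𝗍𝖾 follows the edge labeled by the next d-bit symbol. For illustration, the neighbor table below is a random D-regular stand-in built from random permutations; the actual construction requires a lossless expander (Section 1.3).

```python
import random

def random_regular_maps(num_vertices, degree, seed=0):
    """D edge-label maps, each a random permutation of the vertex set (a stand-in for an expander)."""
    rng = random.Random(seed)
    return [rng.sample(range(num_vertices), num_vertices) for _ in range(degree)]

def update(vertex, symbol, neighbor_maps):
    """One walk step: follow the edge labeled `symbol` out of `vertex`."""
    return neighbor_maps[symbol][vertex]

# Condensing a stream of 2-bit symbols (D = 4) on a graph with M = 2**10 vertices.
M, D = 2 ** 10, 4
maps = random_regular_maps(M, D)
state = 0                               # arbitrary fixed start vertex
for x in [3, 1, 0, 2, 2, 1, 3, 0]:      # the stream X_1, ..., X_t
    state = update(state, x, maps)
print(state)                            # the output: a vertex, i.e. m = 10 bits
```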

1.3 CG-Sources, Lossless Expanders, and the DMOZ Condenser

Towards discussing unpredictable sources, let us first review in more detail the previous work, [9], on condensers via random walks. Both here and in [9], we study random walks on lossless expanders. A degree-D (K, ε)-lossless expander on M vertices is an undirected graph G such that for any set S ⊆ V with |S| ≤ K, the size of the neighborhood Γ(S) satisfies |Γ(S)| ≥ (1 − ε)D|S|. (More technically, we consider bipartite graphs ([M], [M], E), as these are what the known explicit constructions yield. However, at a high level, this distinction is not necessary.) [9] proved that a random walk using X, starting from an arbitrary fixed vertex, on a sufficiently good lossless expander, accumulates entropy and thus yields a deterministic condenser.

Theorem 4 ([9], informal).

Let δ, η be constants. Suppose that for every M = 2^m, there exists an explicitly computable D = 2^d-regular (K, ε)-lossless expander on M vertices, with ε ≤ D^{−(1−δ)} and K = M/poly(D). Then, for any positive integer t, there exists an explicit function

𝖢𝗈𝗇𝖽: {0,1}^{n=dt} → {0,1}^{m=Ω(δdt)}

such that given a δ-CG source X, 𝖢𝗈𝗇𝖽(X) is η-close to an (m − O(log(1/η)))-source. Moreover, 𝖢𝗈𝗇𝖽 can be computed in an online manner. ([9] also handled almost δ-CG sources. In this definition, instead of each X_i being a δd-source for every prefix, each X_i is only γ-close to being a δd-source.)

That is, 𝖢𝗈𝗇𝖽 condenses X to within a constant entropy gap, where we say that the entropy gap (with error ε) of a distribution Z over {0,1}^m is Δ if Z is ε-close to an (m − Δ)-source.

But in practice, it may be unreasonable to assert that for every i and every prefix x_1, …, x_{i−1}, the next symbol X_i is highly unpredictable. To address this, [9] does consider generalized versions of CG sources, although the guarantee about 𝖢𝗈𝗇𝖽(X) from Theorem 4 for such sources is much weaker there. In particular, the situation (before this work) becomes quite bleak when introducing the generalization coined ρ-error in [9].

Definition 5 (ρ-almost CG source).

A ρ-almost δ-CG source is a sequence of random variables X = X_1 ∘ ⋯ ∘ X_t with X_i ∈ {0,1}^d, such that for each i ∈ [t], with probability at least 1 − ρ over the prefix a ∼ X_1 ∘ ⋯ ∘ X_{i−1} (where a ∈ {0,1}^{d(i−1)}), it holds that H_∞(X_i | X_{[1,i−1]} = a) ≥ δd.

It turns out that in the presence of ρ-error (together with other error types discussed shortly), general min-entropy sources are almost CG sources in some regime of parameters (see [9, Section 8]). Thus, in general, deterministic condensing to within a constant entropy gap of such sources is impossible. Moreover, before this work, it was unknown whether or not a random walk using a ρ-almost CG source mixes well in any sense at all. In this paper we show that it is indeed the case that random walks mix, even under the more general notion of unpredictable sources, which captures all previously considered generalizations of CG sources.

1.4 Unpredictable Sources

As discussed previously, the notion of CG sources is quite strong: it assumes that every prefix leads to a high-entropy distribution for the next symbol. An unpredictable source does not make such an assumption. Indeed, notice that in the definition of an unpredictable source, we do not directly insist on any guarantee on the (smooth) min-entropy of the distribution X_i | X_{[1,i−1]} = x_{[1,i−1]}, only that usually, the next symbol x_i is unlikely conditioned on x_{[1,i−1]}. We give unpredictable sources their name as they closely resemble the intermediate objects of the same name that show up in pseudorandom constructions such as reconstructive extractors [18, 15, 17] (for the precise definition of reconstructive extractors, see, e.g., [16]).

When talking about (δ, ρ) unpredictable sources, we informally refer to δ as the “entropy rate” of the unpredictable source, and to ρ as its “error rate”. (A (δ, ρ) unpredictable source is indeed, roughly, ρ-close to a δn-source.) It is easy to see that this definition captures almost-CG sources with all previously considered error parameters.

Definition 6 (almost CG source with all error parameters).

We say that X = X_1 ∘ ⋯ ∘ X_t, each X_i ∈ {0,1}^d, is a (δ, γ, ρ, λ) almost CG source if, for at least a (1 − λ) fraction of i ∈ [t], the following holds:

Pr_{x ∼ X}[ H_∞^γ(X_i | X_{[1,i−1]} = x_{[1,i−1]}) < δd ] ≤ ρ.

Indeed, a (δ, γ, ρ, λ) almost CG source is a (δ, γ + ρ + λ) unpredictable source. (It is also true that the converse holds via several averaging arguments, although with a large loss in parameters.) We prefer to study and phrase our results for unpredictable sources, as the statements are clean, and with minimal artifacts of analysis. Additionally, as was the case for almost CG sources with all error parameters, every general source with (1 − ρ)n min-entropy is an unpredictable source, with a much more straightforward argument, with fewer constraints on ρ, and with less loss in converting ρ into the error parameters of an almost CG source (see [10, Proposition 6.4]). In this work, we give the following result for unpredictable sources, analogous to Theorem 4. (Theorem 4 and Theorem 2 are analogous in the sense that they both yield a condenser with constant entropy gap for their respective class of sources. There is a key difference in that Theorem 4 is seedless, whereas Theorem 2 is seeded. This is necessary: as we have discussed before, general weak sources are unpredictable sources.)

Theorem 2 (seeded condensing (informal); see Corollary 3.16 in [10]).

Let δ, ρ be constants, and let D and ε be constants that satisfy ε < D^{−(1−δ)}. Suppose that for every M, there exists an explicitly computable D = 2^d-regular (K, ε)-lossless expander on Θ(M) vertices, with K = M/poly(D). Then, for any positive integer t, there exists an explicit function

𝖢𝗈𝗇𝖽: {0,1}^{n=dt} × {0,1}^{ℓ = log t − 1} → {0,1}^{m=Ω(δdt)}

such that given a (δ, ρ) unpredictable source X and an independent and uniform Y, 𝖢𝗈𝗇𝖽(X, Y) is O((1/δ)·(εD^{1−δ} + ρ))-close to an (m − O(log(1/ε)))-source. Moreover, 𝖢𝗈𝗇𝖽 can be computed in an online manner. (See [10, Appendix B] for a more detailed description of the online version.)

As suggested by the prior discussion, the construction is again to simply use X as instructions for a random walk on a lossless expander, with the random seed indicating the stopping time. In order to show that such a walk mixes, we develop a new analysis, different from that of [9], which we discuss in detail in Section 1.5.

Let us briefly discuss the error term in the theorem’s statement. First, roughly speaking, the term εD^{1−δ} is the probability that a set S “does not expand” in some sense. For example, εD^{1−δ} is, roughly, the probability over a uniformly chosen vertex v ∈ S and a uniform neighbor u ∈ Γ(v), that u has another neighbor in S. Broadly speaking, events such as this are undesirable: they represent “collisions” of paths in the random walk. Thus we expect such an error term, as it corresponds to the (inherent) error of the expander. We should also expect the error rate ρ to appear for the same reason: it corresponds to the probability that “expansion does not occur” due to low-quality randomness (as opposed to the expander’s error). Indeed, if one considers the (1, ρ) unpredictable distribution X that is 0^n with probability ρ, and uniform otherwise, one can see that at every step of a random walk using X, some vertex will have probability mass at least ρ.

Finally, we comment on the relationship between these parameters. In general, as ρ is an error probability over the entire space of X, it is possible for it to be sub-constant. While ρ can be subconstant, the constant-ρ regime is more interesting, and our theorem handles the case when ρ is in fact fairly large, for example ρ ≥ D^{−δ}. Thus, the error term can be thought of as O(ρ/δ), and in fact it is necessary that ρ < δ. Intuitively, using a (δ, ρ) unpredictable source, in a typical run of the random walk, one expects there to be roughly t steps each accumulating δd bits of entropy (for a total of δdt bits of entropy gained), and roughly ρt steps that lose d bits of entropy (for a total of ρdt bits of entropy lost). Thus, overall, one should not expect anything good to happen when the entropy rate of the unpredictable source is smaller than the error rate.
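
As a back-of-the-envelope version of this accounting (a heuristic sketch, not a statement from the paper), the net entropy accumulated over a typical run is about

δd·t − d·ρt = (δ − ρ)·dt,

which is positive only when δ > ρ, matching the requirement above.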

We can also show that even without a random stopping time, we can use our new analysis of random walks to get a result about deterministic condensing, although, as expected, the entropy gap is not constant.

Theorem 3 (seedless condensing (informal); see Corollary 3.17 of [10]).

Let δ, ρ be constants and let D and ε be constants that satisfy ε < D^{−(1−δ)}. Suppose that for every M, there exists an explicitly computable D = 2^d-regular (K, ε)-lossless expander on Θ(M) vertices, with K = M/poly(D). Then, for any positive integer t, there exists an explicit function

𝖢𝗈𝗇𝖽: {0,1}^{n=dt} → {0,1}^{m=Ω(δdt)}

such that given a (δ, ρ) unpredictable source X, 𝖢𝗈𝗇𝖽(X) is O(εD^{1−δ} + ρ)-close to a (1 − β)m-source, where β = O((1/δ)·√(εD^{1−δ} + ρ)). Moreover, 𝖢𝗈𝗇𝖽 can be computed in an online manner.

Overall, Theorem 2 and Theorem 3 indicate that it is indeed possible to condense a very general class of sources in an online manner.

On Extracting from Unpredictable Sources

As a final note for this section, recall that we cannot hope to extract from arbitrary unpredictable sources without a seed. However, even if one only cares about extracting, rather than condensing, from unpredictable sources, our work is the first to do so in an online manner: Indeed, one can use Theorem 2 to condense to within constant entropy gap, and then apply a known online construction for sources with constant entropy gap. We stress that known constructions for arbitrary weak sources with linear entropy rate, such as [20, 21], are not online, and thus are unsatisfying for streams of randomness.

1.4.1 A Two-Stage Construction, and Recent Developments in Lossless Expanders

An expert reader may notice that the statements of Theorem 4, Theorem 2, and Theorem 3 are slightly weaker than what is actually achievable. In each of these theorems, we require explicit expanders with ε ≤ D^{−(1−δ)}. In other words, as the entropy rate δ of the source gets smaller, the error of the expander that we use must improve. Optimal, non-explicit expanders (as well as random ones) can achieve a dependence of ε ≈ 1/D, and would thus allow us to handle any constant entropy rate δ. However, the [2] explicit construction only achieves ε ≈ D^{−1/6}, and more recent works improve this to ε ≈ D^{−1/2} [6, 12]. Thus, even considering recent improvements, Theorem 4, Theorem 2, and Theorem 3 can only support entropy rate δ > 1/2.

Fortunately, explicit optimal constructions are not necessary. A trick from [9], the two-stage construction, utilizes constant-sized optimal expanders (found by brute force) to condense small blocks of the stream X_1, …, X_t into larger blocks with a higher entropy rate δ′ > δ, high enough to use the known (suboptimal) explicit constructions. For details of the construction for unpredictable sources, see [10, Section 4].

Nevertheless, we choose to present our results and phrase our theorems assuming optimal expanders, for several reasons. The first is that the parameters are better, aesthetically simpler, and easier to analyze when no two-stage construction is required. Moreover, we wish to highlight that the novelty of this work is the analysis of random walks on a single expander, without the trick of the two-stage construction. Finally, because of the two-stage construction, the current lack of better constructions of explicit expanders is not an inherent barrier to explicit condensing for smaller δ.

We give instantiations of Theorem 2 and Theorem 3 for any entropy rate δ > 0 in [10, Theorem 5.3 and Theorem 5.5], using currently known explicit expanders and the two-stage construction. As the statement is relevant for the next discussion, we give an informal version of the latter here, which considers deterministic condensing.

Theorem 4 (informal; see Theorem 5.5 of [10]).

Let δ > 0 be any constant, let d ≥ poly(1/δ), and let ρ ≤ poly(δ). Then, for any positive integer t, there exists an explicit function

𝖢𝗈𝗇𝖽: {0,1}^{n=dt} → {0,1}^m

with m = Ω(δdt), such that for any (δ, ρ) unpredictable source X = X_1 ∘ ⋯ ∘ X_t with each X_i ∈ {0,1}^d, 𝖢𝗈𝗇𝖽(X) is ρ^{1/C}-close to a (1 − β)m-source, where β ≤ ρ^{1/C}, for some universal constant C.

1.4.2 Perspective: Condensing from Unpredictable Sources vs. General Sources

Having presented our main results about seeded condensing to within constant entropy gap, and deterministic condensing outputting a constant fraction of the entropy, we provide a few observations about the nature of unpredictable sources.

First, we know that general sources are unpredictable sources in the high-entropy regime (see [10, Proposition 6.4]). Indeed, if H_∞(X) ≥ (1 − ρ)n, then X is already a (δ = 0.99, 100ρ) unpredictable source. Thus, essentially for any δ, nontrivial condensing requires a seed, and we provide a simple such condenser that is even online.

However, this does not imply that for every δ, an unpredictable source is as hard to deal with as a general source of the same entropy rate. Indeed, Theorem 4 shows that for small entropy rate δ, deterministic condensing is possible with output entropy rate roughly 1 − ρ^{1/C}. Thus, one can deterministically condense unpredictable sources from entropy rate 0.01 to entropy rate 0.99. Such a feat is not possible for general sources! Indeed, if X is a general source with entropy rate, say, 0.6, a simple argument shows that it cannot be deterministically condensed to entropy rate, say, 0.7 (see [10, Claim A.1]). This suggests an interesting property of unpredictable sources: Deterministic condensing “past the entropy rate” of such sources is easy, while the hard part is condensing “past the error rate.” In particular, although our analysis only achieves an output entropy rate of 1 − ρ^{1/C}, we suspect that deterministic condensing to an entropy rate of 1 − O(ρ) is possible. Moreover, there is good reason to believe that there is a barrier to condensing past this error rate: When considering general (1 − ρ)n-sources, the (fractional) entropy gap ρ becomes the error rate when thinking of such a source as an unpredictable source.

We end the discussion by leaving open the question of determining the exact threshold of the output entropy rate achievable by deterministic condensers.

1.5 Technical Overview: Random Walks Using Unpredictable Sources

We are now ready to present a technical overview of how we analyze random walks using unpredictable sources. Since unpredictable sources subsume CG sources, we start by discussing the challenges inherent to both types of sources, and the previous solution for CG sources. An initial observation is that for both types of sources, spectral analysis fails, for two main reasons. The first one is that spectral expanders may not be lossless, and even the best spectral expanders may only be (K, ε = 1/2)-lossless. Such expansion is insufficient for us, as it intuitively means that for a distribution on a set of vertices S, at least half of all edges leaving S may lead to collisions (with perhaps D other nodes in S). Since, as discussed before, collisions imply a loss in entropy, even the good high-entropy steps fail to mix, unless the steps were almost uniform.

The second reason is that random walks using CG sources and unpredictable sources are non-Markovian: The distribution of the next step depends on the entire history of the walk up until that point. Therefore, we cannot analyze the evolution of the vertex distribution at each step by repeatedly applying a transition matrix and bounding the norm of the corresponding probability vector. Moreover, the distribution of the next instruction X_{i+1}, given a prefix x_1, …, x_i, can be adversarial. That is, whatever the vertex distribution p_i may be for each i, X_{i+1} could be the worst possible edge distribution that yields the least amount of improvement for p_{i+1} (while still satisfying the overall conditions on the source X).

Nevertheless, [9] provides a direct analysis that shows that the norm of the vertex distribution does evolve favorably over time. Specifically, they show that for the q-norm, setting q = 1 + α, if p_i is the vertex distribution of the random walk at step i, then ‖p_{i+1}‖_q^q ≲ D^{−δα}·‖p_i‖_q^q. More concretely, they prove:

Theorem 7 (informal; see [9], Theorem 5).

Let G = (U = [M], V = [M], E) be a sufficiently good lossless expander, and let q = 1 + α for some sufficiently small constant α.

Fix any i, and let r_u, for each u ∈ Supp(p_i), be a distribution over {0,1}^d ≅ [D], each being a δd-source. For any u ∈ U and v ∈ V, let r_u(u, v) denote the probability that the edge leading from u to v is chosen under r_u. By definition, p_{i+1}(v) = Σ_{u ∈ Γ(v)} r_u(u, v)·p_i(u). Then,

‖p_{i+1}‖_q^q ≤ O(1/D^{δα})·‖p_i‖_q^q,

as long as ‖p_i‖_q^q is not already smaller than 1/K^α.

This essentially implies that the entropy of the vertex distribution increases by at least δd. Thus, inductively, the final distribution will have ‖p_t‖_q^q ≈ 1/K^α, implying that it has min-entropy roughly k.

For ease of exposition, for the remainder of the section, we consider unpredictable sources where ρ_i = ρ for every i, for some constant ρ. This case captures most of the intuition and difficulty at a high level.

The 𝒒-norm analysis fails

Unfortunately, in the case of unpredictable sources, the q-norm analysis cannot give a good bound on the norm of the final vertex distribution, ‖p_t‖_q. To see this, consider a distribution X that is 0^n with probability ρ, and is a δ-CG source otherwise. This is both a CG source with ρ-error and a (δ, ρ)-unpredictable source. However, the norm of the final vertex distribution will always be at least ‖p_t‖_q ≥ ρ. Thus, the q-norm will not help us establish that the final (smoothed) min-entropy is large. But the smoothed min-entropy is in fact large in this example: clearly, we are ρ-close to having high min-entropy. It has been an open question since [9] to give an analysis showing that this is always the case.

Beating the union bound, once again

Naively, one might try to fix the q-norm analysis as follows. For each i, condition on the event that X_{i+1} has “high entropy” given x_1, …, x_i. The original q-norm analysis could then work on this conditional distribution, and since the probability that this event does not happen is at most ρ, we can conclude that in the i-th step, the q-norm decreases “except with error ρ.” Unfortunately, it is not clear how to chain such an argument multiple times over all t steps without using a union bound, which would require ρ < 1/t.

We remark that originally, in the case of CG sources, [9] uses the q-norm analysis in part to beat the union bound over the expander error ε. As we’ve seen from the discussion above, the q-norm analysis does not allow you to do the same for ρ. Thus, beating the union bound over ρ is yet another challenge to overcome.

Probability evolution, not distribution evolution: a “chain rule” for vertex probabilities

The issue with the approaches above is that they attempt to make a statement about the quality of the vertex distribution at every step i. As discussed, it is not clear how to make any such statement. This leads us to search for an alternative approach. Denote by W(x_1, …, x_i) the vertex reached when taking the instructions x_1, …, x_i. In an unpredictable source, given a typical x ∼ X, one expects to see roughly (1 − ρ)t “good” steps i in which

Pr[X_i = x_i | X_{[1,i−1]} = x_{[1,i−1]}] ≤ D^{−δ},

and roughly ρt “bad” steps in which Pr[X_i = x_i | X_{[1,i−1]} = x_{[1,i−1]}] > D^{−δ}. One would like to argue that this directly translates to good steps and bad steps in the random walk. In other words, for every good step,

p_i(W(x_1, …, x_i)) ≤ (1/D^{δ})·p_{i−1}(W(x_1, …, x_{i−1})),

and for every bad step, p_i(W(x_1, …, x_i)) ≤ D·p_{i−1}(W(x_1, …, x_{i−1})).

Notice that such an approach does not directly make a statement about the distribution at each step: we do not claim that for each i, the overall entropy of p_i increases. Rather, we say that each individual path of vertices that the random walk takes is on its own journey of ups and downs in probability, and typically only a few steps push the probability up. We are able to make this approach concrete with the following key lemma.

Theorem 5 (chain rule for vertex probabilities; see Theorem 3.6, Corollary 3.7 of [10]).

Let G be a D-biregular (K, ε) lossless expander. Let X = X_1 ∘ ⋯ ∘ X_t, each X_i ∈ {0,1}^d, and fix some 0 < δ ≤ 1. Then, for any i ∈ [t], there is a subset S_i ⊆ {0,1}^{n=dt} with Pr[X ∈ S_i] ≥ 1 − 4εD^{1−δ} − 2ρ, such that for every x ∈ S_i,

p_i(W(x_1, …, x_i)) ≤ max( 2D^{−δ}·p_{i−1}(W(x_1, …, x_{i−1})), D^{O(log 1/ε)}/K ).

We believe that this chain rule for vertex probabilities is interesting in its own right. The standard chain rule for probability states that for every x ∈ Supp(X) and every i, the probability of x_1, …, x_i decreases from the probability of x_1, …, x_{i−1} by a factor of Pr[X_i = x_i | X_{[1,i−1]} = x_{[1,i−1]}]. The chain rule for vertex probabilities states that the same evolution of probabilities occurs when considering W(x_1, …, x_i) and W(x_1, …, x_{i−1}), as long as the conditional probability of x_i “has entropy” (as is needed for expansion), and accounting for the probability of a collision due to the inherent error of the expander, or the probability that the next step has no entropy. (An expert reader might ask how Theorem 5 compares to the standard statement about lossless expanders as lossless conductors. In the standard case, when δ = 1 and ρ = 0, the property of lossless conductors states that if p_{i−1} is a source with min-entropy k′ < k = log K, then p_i is ε-close to a (k′ + d)-source. Theorem 5, on the other hand, requires no assumption on the entropy of the input source p_{i−1}, and is still able to conclude that the distribution “improves” in one step.)

Analyzing a full random walk

So far, we’ve shown that at every step, there is a high probability over x ∼ X that the corresponding vertex probabilities decrease. Notice we have made no assertion yet about how drastically the vertex probability might increase when the event S_i does not occur. However, it is not too hard to show that it is extremely unlikely for the probability to increase drastically. Overall, we can argue that in expectation over x ∼ X, there are roughly (1 − ρ)t steps for which the vertex probability goes down by a factor of roughly D^{−δ} (more accurately, p_i(W(x_1, …, x_i)) ≤ 2D^{−δ}·p_{i−1}(W(x_1, …, x_{i−1}))), and the total factor increase from the remaining ρt steps is roughly D^{ρt}.

This argument so far is essentially all we need to obtain Theorem 3: When ρ is small, for a typical x, the number of good steps is overwhelmingly larger than the number of bad steps, and therefore one expects most runs of the random walk to end up at a vertex whose probability is at most p_t(W(x_1, …, x_t)) ≤ D^{−(δt − ρt)}.
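
Spelling out the arithmetic behind this bound (a heuristic sketch that ignores the capacity floor D^{O(log 1/ε)}/K from Theorem 5 and the factor-2 slack per good step):

p_t(W(x_1, …, x_t)) ≲ (D^{−δ})^{(1−ρ)t} · D^{ρt} = D^{−(δ(1−ρ) − ρ)t} ≈ D^{−(δ−ρ)t},

so as long as ρ is noticeably smaller than δ, the final vertex probability is exponentially small in t.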

Using a random stopping time

To obtain a seeded condenser with constant entropy gap, as in Theorem 2, we must characterize a bit more accurately how a typical run of the random walk behaves. In reality, Theorem 5 states that if i is a good step for x_1, …, x_t (that is, x ∈ S_i), then the probability of the vertex reached at step i decreases by a factor of roughly D^{−δ}, as long as the vertex probability has not already reached the “capacity” D^{O(log 1/ε)}/K, which is within a constant factor of 1/M, for M being the number of vertices in the expander. (In general, for constant-degree lossless expanders, K = M/poly(D).) If we can prove that, over a random x ∼ X and a random stopping time i, the vertex probability is at capacity with high probability, then we have proven Theorem 2.

Suppose we choose M to be noticeably less than D^{δt}, say, D^{(δ/2)t}. Then, we expect that in a typical run of the random walk, the vertex probability reaches the capacity (or is close to it) after t/2 steps. We can assume for simplicity that the vertex probability is exactly at capacity after t/2 steps. Now, let us consider what happens in the last t/2 steps, under this assumption. There are only ρt < t/2 steps for which the vertex probability can increase, each of which increases it by a factor of roughly D. Thus, most of the other t/2 steps either keep the vertex probability at capacity, or “repair” a deficit from capacity by a factor of D^{−δ}. Overall, this means that over a random stopping time in the last t/2 steps, the probability of not being at capacity (meaning the walk has recently taken one of the ρt bad steps, or one of the (ρ/δ)t “repairing” good steps) is roughly ρ + ρ/δ = O(ρ/δ).

1.5.1 Making Our Condensers Fully Online

A random-walk-based condenser is online if each symbol X_i is processed and used to update the state sequentially in a “read-once” fashion. However, there is an issue when the length of the stream t is not known ahead of time. Indeed, if one must settle on a graph of size M ahead of time, then one cannot hope for a final output entropy larger than m = log M. This is problematic if the length of the stream t is not known ahead of time and ends up being much larger than m, as we will miss out on most of the entropy of the stream. Broadly speaking, the workaround to this issue is as follows: If we know how much entropy we expect overall in each prefix X_1, …, X_i, then we can repeatedly increase the size of the graph at regular intervals, to accommodate the additional entropy expected. This allows us to maintain the guarantee that the entropy of the vertex distribution is close to m for all (or most) steps.

In a bit more detail, for simplicity, assume that the total length of the stream is a power of two (although still unknown). We begin the random walk from a fixed vertex on a small D-regular graph of size 2^C for some constant C. If we see more than C symbols from the stream, we embed the current vertex into a D-regular graph of size 2^{2C} and walk for another C steps. If a node in the smaller graph is represented by v ∈ {0,1}^C, then one can embed it in the larger graph as v ∘ 0^C. Such an embedding provides a one-to-one mapping from the vertex distribution in the small graph to one with the same entropy in the large graph. We repeat this embed-and-walk process until the stream ends. This idea essentially suffices to implement the deterministic condenser of Theorem 3 in an online fashion.
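
A minimal sketch of this embed-and-walk bookkeeping (Python, not the paper's actual construction; the precise growth schedule in the paper depends on δ and d, and make_graph is a hypothetical constructor standing in for an explicit lossless-expander family):

```python
import random

def online_walk(stream, make_graph, C=8, D=4):
    """Walk on graphs of growing size without knowing the stream length in advance.
    make_graph(m, D) returns D neighbor maps of a D-regular graph on 2**m vertices."""
    m = C
    graph = make_graph(m, D)
    vertex, seen = 0, 0                 # fixed start vertex; number of symbols processed
    for symbol in stream:
        if seen == m:                   # grow: embed v -> v || 0^m into a graph on 2**(2m) vertices
            vertex <<= m
            m *= 2
            graph = make_graph(m, D)
        vertex = graph[symbol][vertex]  # one walk step using the next d-bit symbol
        seen += 1
    return vertex, m                    # the state: a vertex of the current 2**m-vertex graph

# Demo with the random-permutation stand-in from the earlier sketch. For long streams one cannot
# store the graph explicitly; this is exactly where strongly explicit expander constructions,
# whose neighbor maps can be computed on the fly, come in.
def make_graph(m, D, seed=0):
    rng = random.Random(seed)
    return [rng.sample(range(2 ** m), 2 ** m) for _ in range(D)]

stream = [random.randrange(4) for _ in range(14)]
print(online_walk(stream, make_graph, C=8, D=4))
```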

To implement the condenser from Theorem 2, we use the same embed-and-walk process, but we must take care to implement the random stopping time in an online fashion as well. Once again, for simplicity, assume that t is a power of two. We once again begin the walk on a constant-sized graph of size 2^C, and initialize a seed of length c = log C to pick a random stopping time in case t ≤ C. When the random stopping time is reached, we save the resulting vertex additionally in the state, and we continue the random walk until time C. If the stream ends at time C, we output the saved vertex. Otherwise, we embed the current vertex into a graph of size 2^{2C}, and use 𝖤𝗑𝗍𝖾𝗇𝖽𝖲𝖾𝖾𝖽 to add one more bit to the seed. Now, Y represents a random stopping time between 1 and 2C, and we can repeat this process until the stream ends. Ultimately, this shows that for every i ∈ [t] that is a power of two, the distribution obtained from using X_1, …, X_i as instructions for a random walk (with a random stopping time) contains most of the entropy k seen so far, within a graph whose size M is not too much larger than K = 2^k, all the while only needing to generate log i bits of seed.

As the details of such an online implementation are mostly minor alterations to the main results of our work, we defer a more detailed explanation to [10, Appendix B].

1.6 Improved Random Walk-Based Extractors for High Min-Entropy Sources

So far, the main takeaway from our results is that lossless expanders can handle unpredictable sources with low entropy rate. On the other hand, spectral expanders can handle unpredictable sources with high entropy rate, as can be seen by looking at the classic random-walk-based extractor.

However, even for the case of sources with high entropy, the lossless expander random walk yields a quantitatively better result. In the classic expander random walk extractor (or sampler), based on the expander Chernoff bound, in order to achieve an error of ρ in the output distribution, it is necessary for the entropy of the input source X to be at least (1 − ρ^2/C)n for some constant C.

Theorem 8 (standard RW-based extractor).

There exists a universal constant C such that the following holds. For every positive integer n and any ρ > 0, there exists an explicit (k, ρ) extractor

𝖤𝗑𝗍: {0,1}^n × {0,1}^{ℓ = log n − O(1)} → {0,1}^{m = Ω(k)}

for any k ≥ (1 − ρ^2/C)n + log(1/ρ).

The ρ^2 factor is inherent in the use of the expander Chernoff bound. On the other hand, if one uses our new lossless expander random walk to condense to constant entropy gap (and then applies known constructions of extractors with short seed length for sources with constant entropy gap), one only needs the input source to have entropy (1 − ρ/C)n for some constant C in order to obtain final output error ρ. In addition, all of this can be implemented in a fully online manner, even when n is not known ahead of time.

Theorem 6 (new RW-based extractor; see Theorem 6.6 of [10]).

There exist universal constants ρ_0 ∈ (0, 1) and C > 1 such that the following holds. For every positive integer n and any constant ρ ∈ (0, ρ_0), there exists an explicit (k, ρ) extractor

𝖤𝗑𝗍: {0,1}^n × {0,1}^{ℓ = log n + O(log(1/ρ))} → {0,1}^{m = Ω(k)},

for any k ≥ (1 − ρ/C)n.

1.7 Related Work

Before introducing ρ-almost CG sources, [9] first generalizes CG sources by introducing what is coined λ-error.

Definition 9 (λ-almost CG source).

A λ-almost δ-CG source is a sequence of random variables X = X_1 ∘ ⋯ ∘ X_t with X_i ∈ {0,1}^d, such that for at least (1 − λ)t of the indices i ∈ [t], we have that for any prefix a ∈ {0,1}^{d(i−1)}, it holds that H_∞(X_i | X_{[1,i−1]} = a) ≥ δd.

Unlike ρ-error, for λ-error, [9] is still able to construct condensers by running a random walk using X, although not with a constant entropy gap. Instead, the gap is roughly λm, for reasons inherent to the random walk construction itself. Intuitively, for the λ fraction of bad indices i, X_i could be completely determined (and adversarially chosen) based on x_{i−1}. Therefore, whatever edge x_{i−1} instructs the walk to take, x_i could instruct the walk to return via the same edge, effectively wiping out the progress made from x_{i−1}. Overall, when all the bad indices are at the end, this can wipe out λt steps of entropy accumulation, leaving an entropy gap of roughly λt. (When λ > 0 but the λ-fraction of bad blocks is nicely distributed, in the sense that each suffix contains at most a λ-fraction of bad blocks (up to an additive term), we can regain constant entropy gap. See [9, Section 3.1], where this property is called suffix friendliness.) The case of λ-error is interesting in its own right. In fact, a recent work of Chattopadhyay, Gurumukhani, and Ringach [4] shows that deterministic condensing of λ-almost δ-CG sources is impossible, even with large entropy gap, in the regime where λ ≥ 1/2.

Goodman, Li, and Zuckerman [13] showed how to condense CG sources even when the blocks are long and the entropy rate is subconstant. However, their constructions are not online, and they do not address the case of almost CG sources. (However, their constructions do work for suffix-friendly CG sources.)

Previous works that directly consider (deterministic) online extraction [7, 8] assume a strong notion of unpredictability, wherein the X_i-s are independent (but with some min-entropy). In their model, they assume that for every i, |S_i| = |X_i| = n, with the length of the stream t sufficiently long that the total entropy k of X is at least n. Recall that in our work, we generally think of each X_i ∈ {0,1}^d for some constant d, n = dt, and |S_t| ≈ k ≪ n. More specifically, [7] considers how entropy accumulates for specific update functions that are based on practical random number generation. They show that entropy accumulates when the X_i-s are independent draws from certain classes of distributions known as 2-monotone distributions. [8] considers linear update functions and shows that entropy accumulates when the X_i-s are independent k-sources.

Other previously studied notions of sequential sources include Somewhere Honest Entropy Look Ahead (SHELA) sources [1], also known as online Non-Oblivious Symbol Fixing (oNOSF) sources [4]. Such sources are essentially the λ-error CG sources discussed above, except for two distinctions. The first is that the good steps are all high-entropy distributions that are independent from each other. The second is that each bad step only depends on previous blocks. (The latter property is why those sources are called “online”: an adversary corrupts the bad blocks while only knowing the history.) They do not give online condensers for such sources. Aggarwal et al. [1] show that extracting from oNOSF sources is impossible; however, one can convert oNOSF sources into uniform oNOSF sources. Chattopadhyay, Gurumukhani, and Ringach explored the limits of online condensing of such sources [4], and later achieved constructions with essentially optimal parameters [3].

A recent work by Xun and Zuckerman [19] provides constructions of strong offline extractors whose seed length has nearly optimal dependence on n and ε: for any desired α > 0, their construction gives an extractor with seed length (1 + α)log(n − k) + (2 + α)log(1/ε) + O(1), as long as the entropy rate k/n is sufficiently close to 1 (depending on α). To compare, our results discussed in Section 1.6 provide online extractors with seed length log n + O(log(1/ε)) + O(1) when k/n ≥ 1 − Θ(ε).

References

  • [1] Divesh Aggarwal, Maciej Obremski, João Ribeiro, Luisa Siniscalchi, and Ivan Visconti. How to extract useful randomness from unreliable sources. In Advances in Cryptology–EUROCRYPT 2020: 39th Annual International Conference on the Theory and Applications of Cryptographic Techniques, pages 343–372. Springer, 2020. doi:10.1007/978-3-030-45721-1_13.
  • [2] Michael Capalbo, Omer Reingold, Salil Vadhan, and Avi Wigderson. Randomness conductors and constant-degree lossless expanders. In Proceedings of the 34th Annual Symposium on Theory of Computing (STOC), pages 659–668. ACM, 2002. doi:10.1145/509907.510003.
  • [3] Eshan Chattopadhyay, Mohit Gurumukhani, and Noam Ringach. Condensing against online adversaries. In Electronic Colloquium on Computational Complexity (ECCC), 2024. URL: https://eccc.weizmann.ac.il/report/2024/171.
  • [4] Eshan Chattopadhyay, Mohit Gurumukhani, and Noam Ringach. On the existence of seedless condensers: Exploring the terrain. In Proceedings of the 65th Annual Symposium on Foundations of Computer Science (FOCS), pages 1451–1469. IEEE, 2024. doi:10.1109/FOCS61266.2024.00093.
  • [5] Benny Chor and Oded Goldreich. Unbiased bits from sources of weak randomness and probabilistic communication complexity. SIAM Journal on Computing, 17(2):230–261, 1988. doi:10.1137/0217015.
  • [6] Itay Cohen, Roy Roth, and Amnon Ta-Shma. HDX condensers. In Proceedings of the 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 1649–1664. IEEE, 2023. doi:10.1109/FOCS57990.2023.00100.
  • [7] Yevgeniy Dodis, Siyao Guo, Noah Stephens-Davidowitz, and Zhiye Xie. No time to hash: On super-efficient entropy accumulation. In Advances in Cryptology–CRYPTO 2021: 41st Annual International Cryptology Conference, pages 548–576. Springer, 2021. doi:10.1007/978-3-030-84259-8_19.
  • [8] Yevgeniy Dodis, Siyao Guo, Noah Stephens-Davidowitz, and Zhiye Xie. Online linear extractors for independent sources. In Proceedings of the 2nd Conference on Information-Theoretic Cryptography (ITC), pages 14:1–14:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPICS.ITC.2021.14.
  • [9] Dean Doron, Dana Moshkovitz, Justin Oh, and David Zuckerman. Almost Chor–Goldreich sources and adversarial random walks. In Proceedings of the 55th Annual Symposium on Theory of Computing (STOC), pages 1–9. ACM, 2023. doi:10.1145/3564246.3585134.
  • [10] Dean Doron, Dana Moshkovitz, Justin Oh, and David Zuckerman. Online condensing of unpredictable sources via random walks. Electron. Colloquium Comput. Complex., TR24-165, 2024. URL: https://eccc.weizmann.ac.il/report/2024/165.
  • [11] David Gillman. A Chernoff bound for random walks on expander graphs. SIAM Journal on Computing, 27(4):1203–1220, 1998. doi:10.1137/S0097539794268765.
  • [12] Louis Golowich. New explicit constant-degree lossless expanders. In Proceedings of the 35th Annual Symposium on Discrete Algorithms (SODA), pages 4963–4971. ACM-SIAM, 2024. doi:10.1137/1.9781611977912.177.
  • [13] Jesse Goodman, Xin Li, and David Zuckerman. Improved condensers for Chor-Goldreich sources. In Proceedings of the 65th Annual Symposium on Foundations of Computer Science (FOCS), pages 1513–1549. IEEE, 2024. doi:10.1109/FOCS61266.2024.00096.
  • [14] Miklos Santha and Umesh V. Vazirani. Generating quasi-random sequences from semi-random sources. Journal of Computer and System Sciences, 33(1):75–87, 1986. doi:10.1016/0022-0000(86)90044-9.
  • [15] Ronen Shaltiel and Christopher Umans. Simple extractors for all min-entropies and a new pseudorandom generator. Journal of the ACM, 52(2):172–216, 2005. doi:10.1145/1059513.1059516.
  • [16] Amnon Ta-Shma and Christopher Umans. Better lossless condensers through derandomized curve samplers. In Proceedings of the 47th Annual Symposium on Foundations of Computer Science (FOCS), pages 177–186. IEEE, 2006. doi:10.1109/FOCS.2006.18.
  • [17] Amnon Ta-Shma, David Zuckerman, and Shmuel Safra. Extractors from Reed–Muller codes. Journal of Computer and System Sciences, 72:786–812, 2006. doi:10.1016/J.JCSS.2005.05.010.
  • [18] Luca Trevisan. Extractors and pseudorandom generators. Journal of the ACM, 48(4):860–879, 2001. doi:10.1145/502090.502099.
  • [19] Zhiyang Xun and David Zuckerman. Near-optimal averaging samplers. Electron. Colloquium Comput. Complex., TR24-097, 2024. URL: https://eccc.weizmann.ac.il/report/2024/097.
  • [20] David Zuckerman. Randomness-optimal oblivious sampling. Random Structures and Algorithms, 11(4):345–367, 1997. doi:10.1002/(SICI)1098-2418(199712)11:4<345::AID-RSA4>3.0.CO;2-Z.
  • [21] David Zuckerman. Linear degree extractors and the inapproximability of Max Clique and Chromatic Number. Theory of Computing, 3:103–128, 2007. doi:10.4086/TOC.2007.V003A006.