Bit-Fixing Extractors for Almost-Logarithmic Entropy
Abstract
An oblivious bit-fixing source is a distribution over {0,1}^n in which k of the bits are uniform and independent, and the rest are fixed a priori to constant values. Extracting (close to) true randomness from an oblivious bit-fixing source has been studied since the 1980s, with applications in cryptography and complexity theory.
We construct explicit extractors for oblivious bit-fixing sources that support almost-logarithmic entropy, outputting almost all the entropy with low error. The previous state-of-the-art construction that outputs many bits is due to Rao [Rao, CCC ’09], and requires entropy log^C n for some large constant C. The two key components in our constructions are new low-error affine condensers for poly-logarithmic entropies (which we achieve using techniques from the nonmalleable extractors literature), and a dual use of linear condensers for OBF sources.
Keywords and phrases: Seedless extractors, oblivious bit-fixing sources
Category: RANDOM
2012 ACM Subject Classification: Theory of computation → Expander graphs and randomness extractors
Acknowledgements: We thank Jesse Goodman for many helpful and interesting conversations, and Mahdi Cheraghchi for helpful discussion on the cryptographic applications of OBF extractors.
Funding: The work is supported in part by NSF-BSF grant #2022644.
Editors: Alina Ene and Eshan Chattopadhyay
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Whenever randomness is used in computation, whether because it is necessary or simply because it is faster and simpler in practice, access to unbiased bits is assumed. Because sources of perfect randomness are notoriously hard to come by, ad hoc, very weak sources of randomness are used instead. It is essential, therefore, that the crude randomness generated by such sources be purified, a need that drove the development of the beautiful theory of randomness extractors. The problem of extracting randomness from imperfect sources can be traced back to von Neumann [54]. A randomness extractor is a deterministic procedure that converts a weak random source into one that is close to uniform.
Definition 1 (seedless extractor).
Let 𝒳 be a family of distributions over {0,1}^n. We say that Ext : {0,1}^n → {0,1}^m is an extractor for 𝒳 with error ε if, for every X ∈ 𝒳, Ext(X) is ε-close, in total variation distance, to U_m, the uniform distribution over m bits.
One common assumption is that the weak source X has min-entropy k (we say that X has min-entropy k, denoted H_∞(X) ≥ k, if Pr[X = x] ≤ 2^{−k} for every x). We call such an X an (n, k) source. Unfortunately, if all we assume is an entropy guarantee, it is easy to show that such an Ext does not exist. However, seedless extraction is possible for some restricted classes of sources (to extract from general sources, one can use an additional short uniform seed, driving the rich theory of seeded extractors; see Definition 8), and indeed, studying the capabilities and limitations of extraction from natural families of structured sources has been a fruitful endeavor for the past four decades (see [27, Section 1.3] for a comprehensive up-to-date survey of seedless extraction results).
One of the simplest and most natural families of weak sources of randomness is oblivious bit-fixing sources, wherein k bits are uniform and independent, while the remaining n − k bits are fixed.
Definition 2 (OBF source).
A distribution X over {0,1}^n is an (n, k) oblivious bit-fixing source if there exists a subset S ⊆ [n] of k “good indices” such that the bits X_i for i ∈ S are uniform and independent, and the rest are fixed. (To clarify the terminology, the word “oblivious” refers to the fact that the fixed bits are chosen before the random bits are tossed. In nonoblivious bit-fixing sources, on the other hand, the non-uniform bits can depend arbitrarily on the values of the good ones; see Section 3.4.)
In addition to simulating true randomness, extractors have been found to be useful, even essential, in myriad applications, and extractors for oblivious bit-fixing sources are no exception.
In cryptography, especially in exposure resilient cryptography (see, e.g., [21]), there are several related primitives for which OBF extractors are useful. (Cryptographic) resilient functions (and almost perfect resilient functions) are essentially OBF extractors, possibly under a different choice of error measure [6, 53]. Exposure resilient functions are a weaker variant, wherein the security is guaranteed only on average over the uniform bits [9]. Both primitives are useful for wiretap protocols, where two parties wish to agree on a random string in the presence of an adversary that can learn some of the transmitted bits. Another related notion is that of all-or-nothing transforms, used for protection of block ciphers [49, 32, 21, 22] (see also [15], and the discussion in [15, Appendix A]). OBF extractors were also used for distributed generation of correlated randomness [23]. Outside of cryptography, OBF extractors have also found applications in complexity theory, mainly for lower bounds [35, 14].
Previous constructions
Extractors for OBF sources were first studied by Chor, Goldreich, Håstad, Friedman, Rudich, and Smolensky [17] in the zero-error setting (indeed, this is one of the rare examples in which outputting exactly uniform bits is possible). They observed that the simple XOR function outputs a uniform bit for any k ≥ 1, but also proved that a much larger k is necessary even for the case of two output bits. More generally, via establishing connections to error correcting codes, Chor et al. gave explicit zero-error extractors for large k, alongside nearly matching upper bounds on the number of bits that can be extracted. Further lower bounds in this regime were obtained by Friedman [24].
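To make the zero-error claim concrete, here is a minimal Python sketch (the helper names and parameters are ours, purely for illustration): it enumerates the support of a small OBF source and checks that the XOR of all bits is exactly uniform whenever at least one good bit exists.

```python
from itertools import product

def xor_extractor(x):
    """XOR of all the bits: the classic zero-error one-bit OBF extractor."""
    out = 0
    for b in x:
        out ^= b
    return out

def obf_source(n, good, fixed):
    """Enumerate the support of an OBF source over n bits: the indices in
    `good` range over all values, the remaining bits stay pinned to `fixed`."""
    for vals in product([0, 1], repeat=len(good)):
        x = list(fixed)
        for i, v in zip(good, vals):
            x[i] = v
        yield x

# With k = 2 good bits out of n = 5, the XOR output is exactly uniform:
# each output value occurs on exactly half of the 2^k support points.
outs = [xor_extractor(x) for x in obf_source(5, [1, 3], [0, 1, 0, 1, 1])]
print(outs.count(0), outs.count(1))  # → 2 2
```

Since the XOR is a linear function hit by at least one uniform bit, the fixed bits only shift the output, never bias it.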
What if we allow error, and wish to support lower entropies? Kamp and Zuckerman [34] were the first to go below linear entropy, with exponentially small error. The entropy loss in the [34] construction was later improved by Gabizon, Raz, and Shaltiel [26], who extracted almost all the entropy with exponentially-small error. Gabizon et al. were also able to reduce the entropy requirement further, but with a substantially worse error. But often, especially in cryptographic applications, a large error is prohibitive. The first low-error OBF extractor for poly-logarithmic entropy was constructed by Rao [47], outputting many bits with low error. We will further discuss Rao’s construction in Section 2.1.
For even lower entropies, recent explicit constructions that support near-logarithmic entropy already work for the more general family of affine sources. (An affine source is a distribution that is flat over some affine subspace of {0,1}^n of dimension k. Thus, an OBF source is simply an affine source whose basis consists of only elementary vectors.) Using techniques from the nonmalleable extractors literature, Chattopadhyay, Goodman, and Liao [10] supported near-logarithmic entropy, and this was later improved by a recent work of Li [44]. Unfortunately, the latter two constructions work for constant error and one output bit, which falls short of what we want for OBF extractors. One can think of applying techniques similar to the ones in [50, 42], but these, to the best of our knowledge, do not readily give a significant improvement on either the error or the output length.
Lastly, it is interesting to note that while a straightforward application of the probabilistic method gives a non-explicit construction already for roughly logarithmic entropy, logarithmic entropy is not a lower bound. In [34], Kamp and Zuckerman gave a construction that outputs only a few bits and has constant error, but works for any k. In [20], Cohen and Shinkar gave a lower bound for extraction in the sub-logarithmic regime, showing that when k is a small enough function of n, few output bits are all we can hope for. Moreover, they gave a “semi-explicit” construction, when the error is large enough, for sub-logarithmic entropy.
Our result
We give an explicit low-error construction for almost-logarithmic entropy that outputs almost all the entropy.
Theorem 3 (see also Theorem 24).
There exist constants and such that the following holds for any and . There exists an explicit function
that is a extractor for OBF sources, where .
While the error guarantee does not meet the optimal one, both our construction and [47] achieve low error already from poly-logarithmic entropy. Importantly, our construction is the first to extract almost all the entropy with vanishing error starting from almost-logarithmic entropy. In particular, our error guarantee surpasses an inverse-polynomial-type dependence for all ranges of k.
The low-error challenge
The recent decade has seen exciting progress in extractors that support small entropies, most notably in constructions of two-source extractors ([13, 4, 19, 43, 44]); a two-source extractor for entropy k extracts from the family of sources X = (X_1, X_2) comprising two independent sources, each with min-entropy at least k. Similar techniques were used to construct affine extractors [42, 10, 44]. While these extractors, which generally follow the celebrated Chattopadhyay–Zuckerman framework [13], work for low entropies, their error is relatively high for reasons which will be discussed in Section 2.2. Low-error constructions are crucial for the security of cryptographic applications, as well as for good correlation bounds in lower bounds applications, but despite numerous attempts, recent constructions do not obtain negligible error in polynomial time. (For two-source extractors, the best low-error constructions require entropy linear in n [8, 36], whereas the [13] construction achieves poly-logarithmic entropy with polynomially-small error, and followup constructions for even lower entropies only work for constant error. For affine extractors, the best low-error constructions likewise require linear entropy [55, 37, 45], whereas, again, recent constructions in the near-logarithmic regime only work for constant error. It is worth noting that if one allows three independent sources, then efficient low-error three-source extractors exist already for poly-logarithmic min-entropy [40, 3].)
Our construction uses key components of the [13] framework (mainly towards constructing new low-error affine condensers, see Section 2.2), and is able to outperform the error parameter in most related constructions. We thus view this work also as a proof-of-concept for utilizing more recent machinery to tackle low-error constructions.
2 Proof Overview
Our construction combines two new components: A low-error affine condenser with a small gap, and a dual use of linear condensers for OBF sources. In Section 2.1, we will revisit Rao’s construction [47] and his use of linear condensers for OBF sources. In Section 2.2, we will discuss the [13] framework, the low-error adaptation of [3] to two-source condensers, and our construction of affine condensers. In Section 2.3, we will give the full construction, after discussing our use of linear condensers for OBF sources.
2.1 Rao’s Construction
Given an OBF source X, Rao’s construction [47] goes roughly as follows. (The [47] construction works not only for OBF sources, but more generally for low-weight affine sources. Our result captures low-weight affine sources as well, and we discuss this in Section 5.3.)
1. Transform X into a much shorter source by applying a linear, entropy-preserving transformation. This step can be seen as applying a linear condenser for OBF sources, and we discuss it a bit more thoroughly below. Note that the resulting source is an affine source.
2. Transform the condensed source into an affine somewhere-random source. This is a source distributed over a table, where one row is uniform, and every other row depends on the uniform one in an affine way. The table has few rows, but the length of each row is almost the full entropy. This transformation is done via applying a (linear) seeded extractor with every possible seed. (More accurately, to make each row long enough, Rao injects more entropy into the table by performing an additional extraction step, this time with the condensed source itself as the source, and a row as the seed. This is the part that requires the source to have poly-logarithmic entropy.)
3. Next, extract from the table. This step is done via repeatedly applying an affine somewhere-random condenser, which halves the number of rows in the table while only shortening each row by a little bit.
Our construction makes use of Step (1) (in fact, more than once, see Section 2.3), and completely dispenses with Step (3). In our construction, once we have our table from Step (2), we condense it into a single string using a different mechanism. This is encapsulated under a new affine condenser, which we discuss soon in Section 2.2.
Linear Condensers for OBF Sources
In [47], Rao observed that parity check matrices of binary error correcting codes are good linear condensers for OBF sources, and for low-weight affine sources in general. (It is interesting to note that relying on the distance property of error correcting codes alone cannot give optimal linear OBF condensers; one can show that there exist condensers with a shorter output length.) Specifically, let H be the parity check matrix of a linear code with small co-dimension and large distance (see Section 3.5 for the relevant definitions). Then, given an OBF source X, the affine source HX still has entropy k (see Lemma 22 for the easy proof).
This (lossless) condensing allows Rao, and us, to apply affine primitives on sources of much shorter length. Indeed, working with BCH codes, one can get a short output length. ([47] uses an error correcting code that comes from small-bias sets, and achieves a worse distance-to-codimension tradeoff. In fact, for our construction, BCH codes are necessary.) In our construction, we will also use a lossy instantiation of these linear OBF condensers, with an even shorter output. We discuss this further in Section 2.3.
2.2 Low-Error Affine Condensers
The [13] Framework
We start by briefly describing the [13] framework, originally geared towards obtaining an extractor for two independent sources. The CZ construction uses two main ingredients: a “correlation breaking” primitive, and a resilient function. (The idea of constructing extractors via correlation breaking appeared prior to [13], notably in [38, 39, 41, 18].) For the former, we will consider here a t-correlation breaker with advice (the actual definition offers much more flexibility than what we describe here, but it suffices for this discussion; also, the original CZ construction uses non-malleable extractors as the correlation breaking primitive, but there are known reductions between the two objects). This is a function AdvCB that takes inputs from a weak source X, a uniform seed Y, and a fixed advice string α, and the guarantee is that

AdvCB(X, Y, α) is close to uniform, even conditioned on AdvCB(X, Y^1, α^1), …, AdvCB(X, Y^t, α^t),

where Y, Y^1, …, Y^t are independent of X, and the α^i-s are all different from α. Roughly speaking, for a typical seed y, AdvCB(X, y, α) is close to uniform even given the value of AdvCB on other adversarially chosen seeds. Hence, if we were to build a table with one row per seed, and put, say, AdvCB(X, y, α = y) in the y-th row, then rows of good seeds are not only close to uniform, but they’re also close to being t-wise independent.
Yet, there are no one-source extractors, and [13] needs to use a second weak source to sub-sample from the seeds of AdvCB. Specifically, take a strong seeded extractor Ext (see Definition 8 for the formal definition), and consider the table T(X, Y) in which the i-th row is given by

T_i = AdvCB(Y, Ext(X, i), i). (1)

An analogous way to view the construction of T, which will be more beneficial towards our construction, is to first create the table with rows R_i = Ext(X, i). While most of the rows in that table are marginally close to uniform, they are arbitrarily correlated. We then use AdvCB with an independent source Y to (partially) break correlations. (This viewpoint was taken, e.g., in [18, 42, 5].)
It turns out that T itself, when the parameters are set correctly, is close to a table in which “good” rows are truly t-wise independent, and the “bad” rows can depend arbitrarily on the good ones. Indeed, many constructions following [13] can also be divided into two steps, where the first one transforms X and Y into a table T with the above structure, known as a non-oblivious bit-fixing source, and the second one, which we now discuss, is where resilient functions come into play.
Concisely, a resilient function is an extractor for non-oblivious bit-fixing sources. That is, we want a nearly-balanced function whose output cannot be heavily influenced by any small set of bad bits (we’re considering a single output bit for now), and in our case it needs to be resilient even when the good bits are only t-wise independent (and uniform), unlike the more standard case in which the good bits are completely independent. The second step of the CZ framework then amounts to applying a resilient function f on the rows of T, outputting f(T).
While this beautiful approach does give us a two-source extractor, it is inherently bound to have runtime which is polynomial in 1/ε. The Kahn–Kalai–Linial theorem [33] tells us that no matter what f is, there will always be a single bad bit that has influence Ω(log N / N) on an N-bit table, i.e., with that probability over the good bits, the single corrupt bit can fully determine the result. (In terms of explicit f-s, [13] derandomized the Ajtai–Linial [1] randomized construction. In a subsequent work, Meka [46] obtained a derandomized version of [1] that matches the randomized construction; see also [30, 31] for followup constructions with smaller bias.) Thus, to achieve error ε the table must be large, and the running time, which is at least the table size, is also at least polynomial in 1/ε. This is a common feature of almost all constructions that follow the CZ approach.
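As a small illustration of the influence bottleneck, the following toy sketch (our own code, not the [13] construction) computes the influence of every variable of majority on 5 bits by brute force. KKL says that for any nearly-balanced f some variable must have influence Ω(log N / N); majority spreads influence evenly, but each variable still carries influence Θ(1/√N), far above that floor.

```python
from itertools import product

def influence(f, n, i):
    """Pr over uniform x in {0,1}^n that flipping bit i changes f(x)."""
    cnt = 0
    for x in product([0, 1], repeat=n):
        y = list(x)
        y[i] ^= 1
        if f(x) != f(tuple(y)):
            cnt += 1
    return cnt / 2 ** n

def maj(x):
    return int(sum(x) > len(x) // 2)

n = 5
infs = [influence(maj, n, i) for i in range(n)]
# By symmetry all five influences coincide: flipping a bit matters exactly
# when the other four bits split 2-2, i.e. C(4,2) / 2^4 = 6/16 = 0.375.
print(infs)  # → [0.375, 0.375, 0.375, 0.375, 0.375]
```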
The [3] Low-Error Adaptation
In [3], Ben-Aroya, Cohen, Doron, and Ta-Shma evaded the above resilient functions barrier by aiming for a weaker object – a two-source condenser. Namely, the output is not required to be ε-close to uniform, but only ε-close to some random variable that has min-entropy m − g, where we call g the entropy gap (note that an extractor has gap 0). While when g = 0, a single malicious bit can bias the result pretty significantly, the key observation in [3] is that when g becomes large, the probability that the bad bits can reduce g bits of entropy from the output can be exponentially-small in g. We call such a function an entropy-resilient function, which is essentially a condenser for non-oblivious symbol-fixing sources (that is, the natural extension of bit-fixing sources to arbitrary alphabets). In [3], they constructed an explicit such function, and were able to conclude that the resulting two-source condenser’s output is close to having a small gap, provided that the entropy is large enough. Importantly, the construction’s runtime is polynomial in n, and not in 1/ε.
Teleporting to the affine setting
Starting from the work of Li [42], and through a sequence of followup works [11, 10, 12, 44], the correlation breaking framework was used to extract from affine sources and other related families of weak sources such as sumset sources, small-space sources, and interleaved sources. While in the affine case we don’t have two sources at our disposal, our single source has structure we can utilize.
Specifically, recall our table above, and assume that the seeded extractor is linear, meaning that for any fixed seed, the output is a linear function of the source. Here too, our table has many good rows (in fact, the good rows are exactly uniform), but they are, again, arbitrarily correlated. It turns out that we can use the same source X to break the correlations and make the table close to a t-non-oblivious bit-fixing source. This heavily uses the (by now) standard “affine conditioning” technique (see Lemma 7), in which given a linear function L, we can decompose X = A + B, where both A and B are affine, there is a bijection between A and L(A), and B is independent of L(X). (In our case, the function L is chosen according to our linear seeded extractor, and in fact bundles up several of its instantiations. For a more detailed description of the correlation breaking mechanism, see, for example, [42].) Conveniently, the intricate correlation breaking construction for linearly-correlated source and seed was made explicit in [12], coined affine correlation breakers. The state-of-the-art affine correlation breakers follow from a recent work of Li [44] (see Definitions 13 and 14).
Our Low-Error Affine Condenser
Armed with a low-error two-source condenser, and a method to adapt the two-source setting to the affine world, a low-error affine condenser follows rather easily. Consider the table from Equation 1, but this time we form it as follows:

T_i = AdvCB(X, LExt(X, i), i), (2)

where AdvCB is an affine correlation breaker, and LExt is a linear seeded extractor. We then output f(T), where f is the above entropy-resilient function. We can then show that if our affine source X has poly-logarithmic entropy, f(T) is close to having a tiny entropy gap. This is precisely a condensing guarantee, and the formal construction is given in Section 4.
2.3 Our OBF Extractor
Equipped with a linear condenser for OBF sources, and a low-error affine condenser for poly-logarithmic entropies, one natural attempt would be to first condense the OBF source X linearly, apply the affine condenser to the result, and use its output as a seed for a linear seeded extractor applied to X. Note that the linearly-condensed source is a short affine source with entropy k, so the entropy lower bound of the affine condenser is now much easier to meet! There are three potential problems with the above construction:
1. We want to use the affine condenser’s output as the seed for the linear seeded extractor, but it is not uniform – it is only close to having some small entropy gap. But since the gap is tiny, we can in fact use it as a seed, suffering only a small loss in the error. (This observation is by now standard.)
2. X has length n, and we don’t have linear seeded extractors outputting many bits with optimal seed length. To try and resolve the issue, one may apply the final extraction to the (short) linearly-condensed source rather than to X itself. Even then, we are still left with the challenge of handling the correlations between the source and the seed.
3. Indeed, the seed depends on the source – it’s a deterministic function of it! As mentioned above, we will use the affine conditioning technique in order to handle the correlations. We decompose the source into X = X_1 + X_2, where X_1 and X_2 are affine and independent, the seed is a deterministic function of X_1, and X_2 is independent of the seed.

Morally, after an appropriate “fixing”, the source we use is the affine source X_2. But we can only guarantee a weak lower bound on its entropy, which provides no useful bound if we want to retain most of the entropy of X.
In order to make the seed-determining part shorter, and guarantee that some entropy is left even after the affine conditioning, we make use of linear OBF condensers in two ways:
- A lossless condenser for OBF sources, mapping the n bits of X to a much shorter string while preserving all k bits of entropy.
- A lossy condenser for OBF sources, mapping the n bits of X to an even shorter string, while still retaining some entropy. We construct it using error correcting codes as well, only with different parameters.
For the lossy condenser to condense, we need k to be large enough, and this requirement sets our lower bound on k and the bound on the error.
Finally, we need to show that we can apply the linear seeded extractor LExt. Towards this end, we “affine condition” the condensed source with respect to the linear function computing the seed, and so we can write

LExt(X_1 + X_2, S) = LExt(X_1, S) + LExt(X_2, S), (3)

where X_1 and X_2 are affine and independent, and the seed S is a deterministic function of X_1. Thus, X_2 and S are also independent, and furthermore, X_2 still has enough entropy. This, in turn, implies that LExt(X_2, S) has enough entropy, and we can safely fix X_1, and with it a good seed s, making the first term in Equation 3 fixed, and the second one close to uniform. The full details appear in Section 5.2.
3 Preliminaries
We use log to denote the base-2 logarithm. For an integer n, we denote by [n] the set {1, …, n}. The density of a subset S ⊆ [n] is |S|/n. For a function f : {0,1}^n → {0,1}^m, we say that f is explicit if there exists a deterministic procedure that runs in time poly(n) and computes f.
Random variables, entropy
The support of a random variable X distributed over some domain Ω is the set of ω ∈ Ω for which Pr[X = ω] > 0, which we denote by Supp(X). The total variation distance (or, statistical distance) between two random variables X and Y over the same domain Ω is defined as |X − Y| = max_{A ⊆ Ω} |Pr[X ∈ A] − Pr[Y ∈ A]|. Whenever |X − Y| ≤ ε we say that X is ε-close to Y and denote it by X ≈_ε Y. We denote by U_n the random variable distributed uniformly over {0,1}^n. We say a random variable is flat if it is uniform over its support. Whenever we write x ∼ S for a set S, we mean that x is sampled from the flat distribution over S.
For a function f : Ω → Ω′ and a random variable X distributed over Ω, f(X) is the random variable distributed over Ω′ obtained by choosing x according to X and computing f(x). For a set A ⊆ Ω, f(A) = {f(a) : a ∈ A}. For every f and two random variables X and Y distributed over Ω it holds that |f(X) − f(Y)| ≤ |X − Y|, which is often referred to as a data-processing inequality. Another property of statistical distance is the triangle inequality, which states that for all distributions X, Y, and Z, we have |X − Z| ≤ |X − Y| + |Y − Z|.
The min-entropy of X is defined by

H_∞(X) = min_{x ∈ Supp(X)} log(1 / Pr[X = x]),

and it always holds that H_∞(X) ≤ H(X), where H is Shannon’s entropy. A random variable X is an (n, k) source if X is distributed over {0,1}^n and has min-entropy at least k.
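These definitions are easy to evaluate on toy distributions; the following Python sketch (the helper names are ours, for illustration) computes min-entropy and total variation distance for explicitly given distributions.

```python
from math import log2

def min_entropy(dist):
    """H_inf(X) = -log2(max_x Pr[X = x]); dist maps outcome -> probability."""
    return -log2(max(dist.values()))

def tv_distance(p, q):
    """Total variation distance: half the L1 distance between p and q."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys) / 2

uniform2 = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
biased = {"00": 0.5, "01": 0.25, "10": 0.125, "11": 0.125}
print(min_entropy(biased))            # → 1.0 (the heaviest point has mass 1/2)
print(tv_distance(biased, uniform2))  # → 0.25
```

Note that the biased distribution has Shannon entropy 1.75 but min-entropy only 1, illustrating the inequality above.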
Limited Independence
We say that a distribution D over Σ^N is (t, γ)-wise independent if the restriction of D to any t coordinates is γ-close to the uniform distribution. When γ = 0, we simply say that D is t-wise independent.
Lemma 4 ([2]).
Let D be a (t, γ)-wise independent distribution over Σ^N. Then, D is (N^t γ)-close to a t-wise independent distribution.
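A classic toy example of limited independence: from two uniform bits one obtains three pairwise independent bits (the two bits and their XOR), which are far from 3-wise independent. The sketch below (our own illustration) verifies both claims by enumeration over the support.

```python
from itertools import product, combinations
from collections import Counter

# The support of (B1, B2, B1 xor B2) for uniform B1, B2; each point has mass 1/4.
support = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2)]

# Every pair of coordinates is exactly uniform over {0,1}^2 ...
for i, j in combinations(range(3), 2):
    pair_counts = Counter((x[i], x[j]) for x in support)
    assert len(pair_counts) == 4 and all(c == 1 for c in pair_counts.values())

# ... but jointly the three bits are supported on only 4 of the 8 strings,
# so the distribution is pairwise independent yet not 3-wise independent.
triple_counts = Counter(support)
print(len(triple_counts))  # → 4
```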
3.1 OBF and Affine Sources
We repeat the definition of affine and OBF sources.
Definition 5 (affine source).
An affine source X is a distribution that is flat over some (unknown) affine subspace of {0,1}^n of dimension k.
In other words, for any such X there exist linearly independent vectors a_1, …, a_k ∈ {0,1}^n and a shift b ∈ {0,1}^n such that sampling x ∼ X amounts to sampling x_1, …, x_k ∈ {0,1} uniformly at random, and outputting b + Σ_{i=1}^{k} x_i a_i. Notice that when X is affine, H_∞(X) = H(X) = k.
Definition 6 (OBF source).
An (n, k) oblivious bit-fixing (OBF) source is a distribution X over {0,1}^n for which there exists an (unknown) subset S ⊆ [n] of size k, and a string z ∈ {0,1}^n, such that X_S is uniform over {0,1}^S, and X_{[n]∖S} is fixed to z_{[n]∖S}.
In other words, a bit-fixing source is a very structured affine source – one in which the basis vectors are elementary (indicator) vectors. The following lemma will let us use linearly-correlated source and seed when using linear seeded extractors.
Lemma 7 (affine conditioning, [25, 47, 37]).
Let X be an affine source over {0,1}^n, and let L : {0,1}^n → {0,1}^m be a linear function. Then, there exist independent affine sources A and B over {0,1}^n, such that:
- X = A + B.
- There exists c ∈ {0,1}^m such that for every b ∈ Supp(B) it holds that L(b) = c.
- There is a bijection between A and L(A), and there exists an affine function L′ such that A = L′(L(A)). In particular, A is a deterministic function of L(X), and B is independent of L(X).
- For every a ∈ Supp(A), the conditioned source (X | A = a) = a + B is affine with entropy H(B).
3.2 Seeded Extractors
Definition 8 (seeded extractor).
A function Ext : {0,1}^n × {0,1}^d → {0,1}^m is a (k, ε) (seeded) extractor if for every (n, k) source X and an independent and uniform Y ∼ U_d, it holds that Ext(X, Y) ≈_ε U_m. Furthermore, we say that Ext is strong if (Ext(X, Y), Y) ≈_ε (U_m, Y).
We say that Ext is linear if it is linear in the source, namely, if for any x, x′ ∈ {0,1}^n and s ∈ {0,1}^d, it holds that Ext(x + x′, s) = Ext(x, s) + Ext(x′, s).
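The linearity-in-the-source property is easy to test on a toy example. The sketch below is entirely illustrative (it carries no extraction guarantee): it builds a "linear seeded extractor" by fixing one GF(2) matrix per seed, and checks that Ext(x + x′, s) = Ext(x, s) + Ext(x′, s) for every seed.

```python
import random

def make_linear_ext(n, d, m, seed=0):
    """A toy linear seeded map: each of the 2^d seeds selects a fixed
    m-by-n GF(2) matrix (illustrating linearity only, not extraction)."""
    rng = random.Random(seed)
    mats = [[[rng.randrange(2) for _ in range(n)] for _ in range(m)]
            for _ in range(2 ** d)]
    def ext(x, s):
        return tuple(sum(row[j] * x[j] for j in range(n)) % 2 for row in mats[s])
    return ext

ext = make_linear_ext(n=8, d=2, m=3)
x1 = [1, 0, 1, 1, 0, 0, 1, 0]
x2 = [0, 1, 1, 0, 1, 0, 0, 1]
xs = [a ^ b for a, b in zip(x1, x2)]
for s in range(4):
    # Linearity in the source: matrix-vector products distribute over XOR.
    assert ext(xs, s) == tuple(a ^ b for a, b in zip(ext(x1, s), ext(x2, s)))
print("linear in the source for every seed")
```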
Combining a linear variant of the GUV seeded condenser [28, 16] with the Shaltiel–Umans extractor [51], we can get the following logarithmic-seed linear seeded extractor, which outputs a constant power of the entropy.
Theorem 9 (linear seeded extractor, I [42]).
There exists a constant c such that the following holds, for any positive integers n and k ≤ n, and any ε > 0. There exists an explicit linear strong (k, ε) extractor Ext : {0,1}^n × {0,1}^d → {0,1}^m with d = O(log(n/ε)) and m = k^{Ω(1)}.
Replacing the SU extractor with Trevisan’s extractor [52], as analyzed in [48], we can output almost all the entropy, at the cost of a larger seed.
Theorem 10 (linear seeded extractor, II).
There exists a constant c such that the following holds, for any positive integers n and k ≤ n, any constant γ > 0, and any ε > 0. There exists an explicit linear strong (k, ε) extractor Ext : {0,1}^n × {0,1}^d → {0,1}^m where m = (1 − γ)k and d = O(log²(n/ε)).
When the extractor is linear and the source is affine, a good seed already implies perfect uniformity, for any nontrivial error.
Lemma 11 ([47]).
Let Ext : {0,1}^n × {0,1}^d → {0,1}^m be a linear strong (k, ε) extractor. Then, for every affine (n, k) source X, it holds that

Pr_{s ∼ U_d}[Ext(X, s) ≡ U_m] ≥ 1 − 2^m ε.
We will also need the following claim, that is often used when one wishes to use seeded extractors with a non perfect seed.
Claim 12 (strong extractors with weak seeds).
Let Ext : {0,1}^n × {0,1}^d → {0,1}^m be a strong (k, ε) extractor, let X be an (n, k) source, and let Y be ε′-close to a (d, d − g) source that is independent of X. Then, with probability at least 1 − ε′ − 2^g √ε over y ∼ Y, it holds that Ext(X, y) ≈_{√ε} U_m.
Proof.
By Markov’s inequality, there exists a set G ⊆ {0,1}^d of density at least 1 − √ε such that for every s ∈ G it holds that Ext(X, s) ≈_{√ε} U_m. Let Y′ be a (d, d − g) source that is ε′-close to Y. Then,

Pr_{y ∼ Y}[y ∉ G] ≤ ε′ + Pr_{y ∼ Y′}[y ∉ G] ≤ ε′ + 2^g √ε.
3.3 Affine Correlation Breakers
We proceed with formally defining affine correlation breakers (with advice).
Definition 13 (affine correlation breaker).
We say that AdvCB : {0,1}^n × {0,1}^d × {0,1}^a → {0,1}^m is a (t, ε) affine correlation breaker for entropy k if for all distributions X, Z, Y, Y^1, …, Y^t, and all strings α, α^1, …, α^t ∈ {0,1}^a such that:
- Y = Y′ + Z and Y^i = Y′^i + Z for all i ∈ [t],
- H_∞(X) ≥ k and Y′ is uniform,
- (X, Z) is independent of (Y′, Y′^1, …, Y′^t), and,
- For all i ∈ [t], α ≠ α^i,
it holds that

(AdvCB(X, Y, α), AdvCB(X, Y^1, α^1), …, AdvCB(X, Y^t, α^t)) ≈_ε (U_m, AdvCB(X, Y^1, α^1), …, AdvCB(X, Y^t, α^t)).

We say that AdvCB is strong if the above holds also when we add Y to both sides.
We will use the following recent affine correlation breaker.
Theorem 14 ([44]).
For any positive integers n, t, and a, and any ε > 0, there exists an explicit strong (t, ε) affine correlation breaker AdvCB : {0,1}^n × {0,1}^d × {0,1}^a → {0,1}^m, where the seed length d and the entropy requirement are polynomial in t, a, and log(n/ε).
3.4 Entropy-Resilient Functions
We summarize the required definitions and results given in [3].
Definition 15 (NOSF sources).
Let ℓ be a positive integer. A (q, t, γ) non-oblivious symbol-fixing source is a random variable X = (X_1, …, X_N) over ({0,1}^ℓ)^N for which there exists a set Q ⊆ [N] of cardinality at most q such that:
- The joint distribution of the X_i-s with i ∉ Q, denoted by X_G, is (t, γ)-wise independent; and
- Each of the random variables X_i with i ∈ Q may depend arbitrarily on all other random variables, in Q and outside it.
If γ = 0, we say that X is a (q, t) non-oblivious symbol-fixing source. If ℓ = 1, the definition coincides with the standard definition of non-oblivious bit-fixing sources.
In [3], Ben-Aroya et al. constructed seedless condensers for NOSF sources, also known as entropy-resilient functions.
Definition 16 (entropy-resilient functions).
Let ℓ be a positive integer. A function f : ({0,1}^ℓ)^N → {0,1}^m is a (q, t, γ, ε) entropy-resilient function with entropy gap g if for every (q, t, γ) non-oblivious symbol-fixing source X over ({0,1}^ℓ)^N, the output f(X) is ε-close to an (m, m − g) source.
Theorem 17 ([3]).
For every constant there exist constants and such that the following holds. For all integers , , every , and for every integer , there exists an explicit function , for , that is entropy-resilient, with entropy gap .
3.5 Error Correcting Codes
A binary code C ⊆ {0,1}^n is linear if C is a linear subspace of {0,1}^n. The dimension of C as a subspace is called the dimension of the code. We will identify a linear code of dimension k with the image of its encoding function E : {0,1}^k → {0,1}^n, which is given by a generator matrix. A parity check matrix of C is a matrix H whose rows span the dual code C^⊥, and thus x ∈ C if and only if Hx = 0.
We say that an error correcting code C has distance d if any two distinct codewords c_1, c_2 ∈ C differ in at least d coordinates. We say that C is an [n, k, d] code if C ⊆ {0,1}^n is a linear code of dimension k and distance d. We will use the following explicit code, the BCH code.
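As a small concrete instance of these definitions, the following sketch uses the [7, 4, 3] Hamming code (a toy stand-in for the BCH codes below; the matrices are the standard systematic ones): it checks that the parity check matrix vanishes on the codewords and that the minimum distance is 3.

```python
from itertools import product

# Generator (4 message bits -> 7 code bits) and parity check matrices of the
# [7, 4, 3] Hamming code, in systematic form.
G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]
H = [[0, 1, 1, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]

def encode(msg):
    return tuple(sum(msg[i] * G[i][j] for i in range(4)) % 2 for j in range(7))

def syndrome(x):
    return tuple(sum(H[r][j] * x[j] for j in range(7)) % 2 for r in range(3))

codewords = [encode(m) for m in product([0, 1], repeat=4)]
# The parity check matrix vanishes on every codeword ...
assert all(syndrome(c) == (0, 0, 0) for c in codewords)
# ... and for a linear code the distance equals the minimum nonzero weight.
dist = min(sum(c) for c in codewords if any(c))
print(dist)  # → 3
```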
Theorem 18 ([7, 29]).
There exists a constant c such that the following holds. For every positive integers n and d, there exists an [n, k, d] code with co-dimension n − k ≤ c·d·log n. (The original BCH code assumes n = 2^m − 1 for some integer m; standard manipulations let us work with any n.) Moreover, the parity-check matrix of the code can be constructed in time poly(n).
4 Low-Error Affine Condensers
In this section, we give our construction of low-error affine condensers with tiny entropy gap.
Theorem 19.
There exist universal constants and such that the following holds. For any positive integers , and any such that , there exists an explicit function where such that for any affine source , it holds that is -close to an source, where .
We will start by reducing an affine source to a non-oblivious symbol-fixing (NOSF) source.
Lemma 20.
There exist constants such that the following holds, for any positive integers , and any , satisfying There exists an explicit function with and such that for every affine source , is -close to a NOSF source.
Proof.
We use the following two building blocks.
- Let AdvCB be the affine correlation breaker of Theorem 14.
- Let LExt be a linear strong seeded extractor (Theorem 9) with seed length d and error ε.
We need to make sure that the entropy requirement of AdvCB is met and that the seed lengths match, and indeed, all the required inequalities hold whenever the entropy of X is at least poly-logarithmic, for a large enough constant in the exponent.
Our construction goes as follows. For simplicity, identify [2^d] with {0,1}^d. Given x ∼ X:
1. For every i ∈ [2^d], compute y_i = LExt(x, i).
2. For every i ∈ [2^d], compute T_i = AdvCB(x, y_i, i).
We’ll show that the table T = (T_1, …, T_{2^d}) satisfies the requirement of the lemma. First, by Lemma 11, we have a set G ⊆ [2^d] of density at least 1 − 2^m ε such that for any i ∈ G it holds that y_i is uniform. Next, fix good rows i, i^1, …, i^t ∈ G. Consider the linear function L that is given by L(x) = (LExt(x, i^1), …, LExt(x, i^t)). Via the affine conditioning lemma, Lemma 7, we can write X = A + B, where:
- A and B are affine and independent.
- X = A + B.
- L(B) is constant, so (y_{i^1}, …, y_{i^t}) is a deterministic function of A, and moreover, B is independent of (y_{i^1}, …, y_{i^t}).
Recalling that each y_i is uniform, all conditions are met to apply the affine correlation breaker, which gives us

(T_i, T_{i^1}, …, T_{i^t}) ≈ (U, T_{i^1}, …, T_{i^t})

for every choice of good rows. Therefore, the good rows of T are close to t-wise independent, and so by Lemma 4, T itself is close to a non-oblivious symbol-fixing source with the desired parameters.
Next, we can simply apply the low-error condenser for NOSF sources given by [3].
Proof of Theorem 19.
Given and , let be the number of bits we eventually output, and the entropy of will be determined according to that later on. Set , and let and be the constants that are set according to , guaranteed to us by Theorem 17.
Let be the function from Lemma 20, where and , set with and . By that lemma, there exists a constant such that whenever and is an affine source, it holds that is -close to an NOSF source.
Now, let be the entropy resilient function from Theorem 17. Whenever , we are guaranteed that is -close to an source with . To conclude, we have that there exist constants and such that whenever and , the function
is an affine condenser with entropy gap . Putting it differently, there exist constants and such that whenever , our condenser outputs bits.
5 Extractors for OBF Sources
5.1 Linear OBF Condensers
We begin by defining these condensers, and focus on the linear setting.
Definition 21 (OBF condenser).
A matrix is an linear condenser for OBF sources if for any OBF source .
Rao showed that linear condensers for OBF sources can be obtained from parity-check matrices of binary codes. The next statement is a bit different from the one in [47], so we provide the proof for completeness. (The connection between pseudorandomness for OBF sources and codes was established, at least for extractors, already in [17], and, possibly under different formulations, in the cryptography literature, e.g., in [15].)
Lemma 22 (following [47]).
Let be an OBF source, let be any positive integer, and let be an code with a parity check matrix , where and . Then, satisfies .
Proof.
Let be the entropic coordinates of , and let be its constant shift. (We can assume that is zero on .) Consider
observing that each element of has Hamming weight at most . Since has distance greater than , is injective on and thus also on . Let be the subspace for which , and also refer to and as the corresponding flat distributions. Thus, for each ,
and we are done.
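To make the injectivity argument concrete, here is a toy Python sketch (our own illustration; the Hamming code and all parameters are our choices, not the paper's BCH instantiation). It brute-forces that the parity-check matrix of the [7,4,3] Hamming code is injective on the support of an OBF source with a single unfixed bit, so the condensed output retains all of the source's min-entropy:

```python
import itertools

# Parity-check matrix of the [7,4,3] Hamming code: column c is the binary
# expansion of c, for c = 1..7.
H = [[(c >> r) & 1 for c in range(1, 8)] for r in range(3)]

def apply_H(H, x):
    # Syndrome H*x over GF(2).
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

def obf_support(free, fixed_val):
    # All points of an OBF source: bits at positions in `free` vary freely,
    # the remaining bits stay at their fixed values.
    pts = []
    for bits in itertools.product([0, 1], repeat=len(free)):
        x = list(fixed_val)
        for i, b in zip(free, bits):
            x[i] = b
        pts.append(tuple(x))
    return pts

# With k = 1 unfixed bit, the code's distance 3 exceeds 2k = 2, so no two
# support points can differ by a codeword; H is injective on the support
# and no min-entropy is lost.
support = obf_support(free=[4], fixed_val=[1, 0, 1, 1, 0, 0, 1])
images = {apply_H(H, x) for x in support}
assert len(images) == len(support)
```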
Using the BCH codes from Theorem 18, we get the following linear condensers for OBF sources.
Corollary 23.
There exists a constant such that the following holds. For every positive integers , , and , there exists an explicit linear function for , such that the following holds. For every OBF source , it holds that . That is, is an linear condenser for OBF sources.
5.2 OBF Extractors for Almost-Logarithmic Entropy
We are given , and . Let be our error guarantee, to be determined later on. We use the following ingredients:
-
Let be the linear condenser for OBF sources given in Corollary 23, set with , so for some universal constant .
-
Let be the linear condenser for OBF sources given in Corollary 23, set with , so .
-
Let be the affine condenser from Theorem 19, set with and .
-
Let be the extractor of Theorem 10, where and . There exists a constant such that a seed of length suffices.
Now, given , our extractor is given by
The fact that is explicit readily follows from the explicitness of its components. The correctness is established in the following theorem.
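The data flow of the composed extractor can be sketched as follows. This is a toy Python illustration under our own assumption about the wiring suggested by the ingredient list: one linearly condensed copy of the source is fed to the seeded extractor as its input, and the affine-condensed second copy serves as its seed. All four components here are hypothetical stand-ins (random linear maps, truncation, a bit-slice selector) carrying none of the theorems' guarantees; only the composition structure is meaningful.

```python
import random

random.seed(0)
n, m1, m2, seed_len, out_len = 32, 16, 8, 4, 4

# Stand-ins for the two linear OBF condensers of Corollary 23.
G1 = [[random.randrange(2) for _ in range(n)] for _ in range(m1)]
G2 = [[random.randrange(2) for _ in range(n)] for _ in range(m2)]

def linmap(G, x):
    # Matrix-vector product over GF(2).
    return [sum(g * xi for g, xi in zip(row, x)) % 2 for row in G]

def aff_cond(y):
    # Placeholder for the affine condenser of Theorem 19: just truncate.
    return y[:seed_len]

def seeded_ext(y, s):
    # Placeholder for the seeded extractor of Theorem 10: the seed
    # selects a bit-slice of y (toy only, no extraction guarantee).
    off = int("".join(map(str, s)), 2) % (len(y) - out_len + 1)
    return y[off:off + out_len]

def ext(x):
    # Dual use of the source: two linear condensers applied to the same x;
    # the second branch is condensed into a seed for the first.
    return seeded_ext(linmap(G1, x), aff_cond(linmap(G2, x)))
```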
Theorem 24.
There exist universal constants and such that the following holds, assuming and . For every OBF source it holds that for , recalling that .
Proof.
Denote , , , and . We “affine condition” (Lemma 7) according to , and so we can write as , where and are affine and independent, and there exists a linear bijection between and (so is independent of and thus also of ). We can then write
We know that , , and . Also, , and and are independent. Thus, as well (to see this, recall that is a subspace of , up to a shift, and is injective on ), and is independent of .
Claim 25.
With probability at least over it holds that .
Proof.
To apply our affine condenser, Theorem 19, we need to make sure that:
-
1.
, where is the constant guaranteed to us by Theorem 19. Recall that
and that
so the inequality is met as long as and for some constants and .
-
2.
, where is the constant guaranteed to us by Theorem 19. Plugging in the expression for above, we similarly need to satisfy:
-
(a)
for some constants ,
-
(b)
for some constant , and,
-
(c)
for some constant .
So indeed, Theorem 19 tells us that itself is -close to a -source for , and thus by Claim 12 we have that except for probability
Fix a good (in the sense of the above claim), and observe that is independent of , making
close to uniform as well. Overall, , and we are done.
Choosing for any constant , we obtain our Theorem 3.
5.3 Handling Low-Weight Affine Sources
We say that an affine source has weight if each basis element of has weight at most . This generalizes OBF sources (which have ), and [47] can handle for a small constant . Note that the only place in the proof where we used the fact that is an OBF source (rather than a full-fledged affine source) is Lemma 22, where we argued that codes with sufficiently large distance give linear condensers. This can easily be extended to the case.
Lemma 26 (following [47]).
Let be an affine source of weight , let be any positive integer, and let be an code with a parity check matrix , where and . Then, satisfies .
Thus, as grows, the co-dimension of the code we need to take grows too, and so the condensing quality decreases. To modify our construction to handle , we need to set and . Repeating the same analysis as in Section 5.2, we see that one can take without significant loss in parameters.
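A toy Python check (again with our illustrative [7,4,3] Hamming code rather than the paper's BCH instantiation) shows what goes wrong when the source's weight is too large relative to the code's distance: if a basis vector of the affine source happens to be a codeword, i.e., lies in the kernel of the parity-check matrix, then the condenser collapses the source and all entropy is lost.

```python
# Parity-check matrix of the [7,4,3] Hamming code: column c is the binary
# expansion of c.
H = [[(c >> r) & 1 for c in range(1, 8)] for r in range(3)]

def apply_H(H, x):
    # Syndrome H*x over GF(2).
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

# A weight-3 basis vector that is a Hamming codeword (columns 1, 2, 3 of H
# sum to zero, since 1 XOR 2 XOR 3 = 0).
v = (1, 1, 1, 0, 0, 0, 0)
assert apply_H(H, v) == (0, 0, 0)

# The weight-3 affine source {a, a + v} maps to a single point under H:
# the condenser output carries zero entropy.
a = (0, 1, 0, 1, 1, 0, 1)
x0 = a
x1 = tuple((ai + vi) % 2 for ai, vi in zip(a, v))
assert apply_H(H, x0) == apply_H(H, x1)
```

This is exactly why Lemma 26 requires the distance, and hence the co-dimension, to grow with the weight bound.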
Theorem 27.
For any constant there exist constants and such that the following holds for any and . There exists an explicit function
that is a extractor for affine sources of weight , where .
References
- [1] Miklós Ajtai and Nathan Linial. The influence of large coalitions. Combinatorica, 13(2):129–145, 1993. doi:10.1007/BF01303199.
- [2] Noga Alon, Oded Goldreich, and Yishay Mansour. Almost k-wise independence versus k-wise independence. Information Processing Letters, 88(3):107–110, 2003. doi:10.1016/S0020-0190(03)00359-4.
- [3] Avraham Ben-Aroya, Gil Cohen, Dean Doron, and Amnon Ta-Shma. Two-source condensers with low error and small entropy gap via entropy-resilient functions. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM), pages 43:1–43:20. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPIcs.APPROX-RANDOM.2019.43.
- [4] Avraham Ben-Aroya, Dean Doron, and Amnon Ta-Shma. An efficient reduction from two-source to nonmalleable extractors: Achieving near-logarithmic min-entropy. SIAM Journal on Computing, 51(2):STOC17–31, 2019.
- [5] Avraham Ben-Aroya, Dean Doron, and Amnon Ta-Shma. Near-optimal erasure list-decodable codes. In Computational Complexity Conference (CCC), pages 1:1–1:27. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.CCC.2020.1.
- [6] Charles H. Bennett, Gilles Brassard, and Jean-Marc Robert. How to reduce your enemy’s information. In Advances in Cryptology (CRYPTO), volume 218 of Lecture Notes in Computer Science, pages 468–476. Springer, 1985.
- [7] Raj Chandra Bose and Dwijendra K. Ray-Chaudhuri. On a class of error correcting binary group codes. Information and Control, 3(1):68–79, 1960. doi:10.1016/S0019-9958(60)90287-4.
- [8] Jean Bourgain. More on the sum-product phenomenon in prime fields and its applications. International Journal of Number Theory, 1(1):1–32, 2005.
- [9] Ran Canetti, Yevgeniy Dodis, Shai Halevi, Eyal Kushilevitz, and Amit Sahai. Exposure-resilient functions and all-or-nothing transforms. In Advances in Cryptology (EUROCRYPT), pages 453–469. Springer, 2000. doi:10.1007/3-540-45539-6_33.
- [10] Eshan Chattopadhyay, Jesse Goodman, and Jyun-Jie Liao. Affine extractors for almost logarithmic entropy. In Annual Symposium on Foundations of Computer Science (FOCS), pages 622–633. IEEE, 2022.
- [11] Eshan Chattopadhyay and Xin Li. Explicit non-malleable extractors, multi-source extractors, and almost optimal privacy amplification protocols. In Annual Symposium on the Foundations of Computer Science (FOCS), pages 158–167. IEEE, 2016. doi:10.1109/FOCS.2016.25.
- [12] Eshan Chattopadhyay and Jyun-Jie Liao. Extractors for sum of two sources. In Annual Symposium on Theory of Computing (STOC), pages 1584–1597. ACM, 2022. doi:10.1145/3519935.3519963.
- [13] Eshan Chattopadhyay and David Zuckerman. Explicit two-source extractors and resilient functions. Annals of Mathematics, 189(3):653–705, 2019.
- [14] Ruiwen Chen, Valentine Kabanets, Antonina Kolokolova, Ronen Shaltiel, and David Zuckerman. Mining circuit lower bound proofs for meta-algorithms. Computational Complexity, 24:333–392, 2015. doi:10.1007/S00037-015-0100-0.
- [15] Mahdi Cheraghchi, Fredric Didier, and Amin Shokrollahi. Invertible extractors and wiretap protocols. IEEE Transactions on Information Theory, 58(2):1254–1274, 2011. doi:10.1109/TIT.2011.2170660.
- [16] Mahdi Cheraghchi and Piotr Indyk. Nearly optimal deterministic algorithm for sparse Walsh-Hadamard transform. ACM Transactions on Algorithms (TALG), 13(3):1–36, 2017. doi:10.1145/3029050.
- [17] Benny Chor, Oded Goldreich, Johan Håstad, Joel Friedman, Steven Rudich, and Roman Smolensky. The bit extraction problem or t-resilient functions. In Annual Symposium on Foundations of Computer Science (FOCS), pages 396–407. IEEE, 1985.
- [18] Gil Cohen. Local correlation breakers and applications to three-source extractors and mergers. SIAM Journal on Computing, 45(4):1297–1338, 2016. doi:10.1137/15M1029837.
- [19] Gil Cohen. Towards optimal two-source extractors and Ramsey graphs. In Annual Symposium on Theory of Computing (STOC), pages 1157–1170. ACM, 2017. doi:10.1145/3055399.3055429.
- [20] Gil Cohen and Igor Shinkar. Zero-fixing extractors for sub-logarithmic entropy. In International Colloquium on Automata, Languages, and Programming (ICALP), pages 343–354. Springer, 2015. doi:10.1007/978-3-662-47672-7_28.
- [21] Yevgeniy Dodis. Exposure-Resilient Cryptography. PhD thesis, MIT, 2000.
- [22] Yevgeniy Dodis, Amit Sahai, and Adam Smith. On perfect and adaptive security in exposure-resilient cryptography. In Advances in Cryptology (EUROCRYPT), pages 301–324. Springer, 2001. doi:10.1007/3-540-44987-6_19.
- [23] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology (EUROCRYPT), pages 486–503. Springer, 2006. doi:10.1007/11761679_29.
- [24] Joel Friedman. On the bit extraction problem. In Annual Symposium on Foundations of Computer Science (FOCS), pages 314–319. IEEE, 1992.
- [25] Ariel Gabizon and Ran Raz. Deterministic extractors for affine sources over large fields. Combinatorica, 28(4):415–440, 2008. doi:10.1007/S00493-008-2259-3.
- [26] Ariel Gabizon, Ran Raz, and Ronen Shaltiel. Deterministic extractors for bit-fixing sources by obtaining an independent seed. SIAM Journal on Computing, 36(4):1072–1094, 2006. doi:10.1137/S0097539705447049.
- [27] Jesse Patrick McGrenra Goodman. Seedless Extractors. PhD thesis, Cornell University, 2023. Available at https://jpmgoodman.com/thesis.pdf.
- [28] Venkatesan Guruswami, Christopher Umans, and Salil Vadhan. Unbalanced expanders and randomness extractors from Parvaresh–Vardy codes. Journal of the ACM (JACM), 56(4):20, 2009.
- [29] Alexis Hocquenghem. Codes correcteurs d’erreurs. Chiffres, 2:147–156, 1959.
- [30] Peter Ivanov, Raghu Meka, and Emanuele Viola. Efficient resilient functions. In Annual Symposium on Discrete Algorithms (SODA), pages 2867–2874. ACM-SIAM, 2023. doi:10.1137/1.9781611977554.CH108.
- [31] Peter Ivanov and Emanuele Viola. Resilient functions: Optimized, simplified, and generalized. arXiv preprint arXiv:2406.19467, 2024. doi:10.48550/arXiv.2406.19467.
- [32] Markus Jakobsson, Julien P. Stern, and Moti Yung. Scramble all, encrypt small. In International Workshop on Fast Software Encryption, pages 95–111. Springer, 1999. doi:10.1007/3-540-48519-8_8.
- [33] Jeff Kahn, Gil Kalai, and Nathan Linial. The influence of variables on Boolean functions. In Annual Symposium on Foundations of Computer Science (FOCS), pages 68–80. IEEE, 1988.
- [34] Jesse Kamp and David Zuckerman. Deterministic extractors for bit-fixing sources and exposure-resilient cryptography. SIAM Journal on Computing, 36(5):1231–1247, 2006. doi:10.1137/S0097539705446846.
- [35] Ilan Komargodski, Ran Raz, and Avishay Tal. Improved average-case lower bounds for De Morgan formula size. In Annual Symposium on Foundations of Computer Science (FOCS), pages 588–597. IEEE, 2013. doi:10.1109/FOCS.2013.69.
- [36] Mark Lewko. An explicit two-source extractor with min-entropy rate near 4/9. Mathematika, 65(4):950–957, 2019.
- [37] Xin Li. A new approach to affine extractors and dispersers. In Computational Complexity Conference (CCC), pages 137–147. IEEE, 2011. doi:10.1109/CCC.2011.27.
- [38] Xin Li. Extractors for a constant number of independent sources with polylogarithmic min-entropy. In Annual Symposium on the Foundations of Computer Science (FOCS), pages 100–109. IEEE, 2013. doi:10.1109/FOCS.2013.19.
- [39] Xin Li. New independent source extractors with exponential improvement. In Annual Symposium on Theory of Computing (STOC), pages 783–792. ACM, 2013. doi:10.1145/2488608.2488708.
- [40] Xin Li. Three-source extractors for polylogarithmic min-entropy. In Annual Symposium on Foundations of Computer Science (FOCS), pages 863–882. IEEE, 2015. doi:10.1109/FOCS.2015.58.
- [42] Xin Li. Improved two-source extractors, and affine extractors for polylogarithmic entropy. In Annual Symposium on Foundations of Computer Science (FOCS), pages 168–177. IEEE, 2016. doi:10.1109/FOCS.2016.26.
- [43] Xin Li. Non-malleable extractors and non-malleable codes: Partially optimal constructions. In Computational Complexity Conference (CCC), pages 28:1–28:49. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPICS.CCC.2019.28.
- [44] Xin Li. Two source extractors for asymptotically optimal entropy, and (many) more. In Annual Symposium on Foundations of Computer Science (FOCS), pages 1271–1281. IEEE, 2023. doi:10.1109/FOCS57990.2023.00075.
- [45] Xin Li and Yan Zhong. Explicit directional affine extractors and improved hardness for linear branching programs. In Computational Complexity Conference (CCC), pages 10:1–10:14. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPIcs.CCC.2024.10.
- [46] Raghu Meka. Explicit resilient functions matching Ajtai-Linial. In Annual Symposium on Discrete Algorithms (SODA), pages 1132–1148. ACM-SIAM, 2017. doi:10.1137/1.9781611974782.73.
- [47] Anup Rao. Extractors for low-weight affine sources. In Computational Complexity Conference (CCC), pages 95–101. IEEE, 2009. doi:10.1109/CCC.2009.36.
- [48] Ran Raz, Omer Reingold, and Salil Vadhan. Extracting all the randomness and reducing the error in Trevisan’s extractors. Journal of Computer and System Sciences, 65(1):97–128, 2002. doi:10.1006/JCSS.2002.1824.
- [49] Ronald L. Rivest. All-or-nothing encryption and the package transform. In International Workshop on Fast Software Encryption (FSE), pages 210–218. Springer, 1997. doi:10.1007/BFB0052348.
- [50] Ronen Shaltiel. How to get more mileage from randomness extractors. Random Structures & Algorithms, 33(2):157–186, 2008. doi:10.1002/RSA.20207.
- [51] Ronen Shaltiel and Christopher Umans. Simple extractors for all min-entropies and a new pseudorandom generator. Journal of the ACM (JACM), 52(2):172–216, 2005. doi:10.1145/1059513.1059516.
- [52] Luca Trevisan. Extractors and pseudorandom generators. Journal of the ACM (JACM), 48(4):860–879, 2001. doi:10.1145/502090.502099.
- [53] Umesh V. Vazirani. Strong communication complexity or generating quasi-random sequences from two communicating semi-random sources. Combinatorica, 7(4):375–392, 1987. doi:10.1007/BF02579325.
- [54] John von Neumann. Various techniques used in connection with random digits. Applied Mathematics Series, 12:36–38, 1951.
- [55] Amir Yehudayoff. Affine extractors over prime fields. Combinatorica, 31(2):245–256, 2011. doi:10.1007/S00493-011-2604-9.
