Near-Optimal Averaging Samplers and Matrix Samplers
Abstract
We present the first efficient averaging sampler that achieves asymptotically optimal randomness complexity and near-optimal sample complexity. For any and any constant , our sampler uses random bits to output samples such that for any function ,
The randomness complexity is optimal up to a constant factor, and the sample complexity is optimal up to the factor.
Our technique generalizes to matrix samplers. A matrix sampler is defined similarly, except that the function is matrix-valued and the absolute value is replaced by the spectral norm. Our matrix sampler achieves randomness complexity and sample complexity for any constant , both near-optimal with only a logarithmic factor in randomness complexity and an additional exponent on the sample complexity.
We use known connections with randomness extractors and list-decodable codes to give applications to these objects. Specifically, we give the first extractor construction with optimal seed length up to an arbitrarily small constant factor above 1, when the min-entropy for a large enough constant . Finally, we generalize the definition of averaging sampler to any normed vector space.
Keywords and phrases: Pseudorandomness, Averaging Samplers, Randomness Extractors
Funding: Zhiyang Xun: Supported by NSF Grant CCF-2312573, a Simons Investigator Award (#409864, David Zuckerman), NSF award CCF-2008868, and the NSF AI Institute for Foundations of Machine Learning (IFML).
2012 ACM Subject Classification: Theory of computation → Pseudorandomness and derandomization
Acknowledgements: We thank Kuan Cheng for introducing us to the matrix sampler problem. We thank Shravas Rao for simplifying and slightly improving Lemma 30 (although this didn't improve our final result). We thank Oded Goldreich, Dana Moshkovitz, Amnon Ta-Shma, Salil Vadhan, and anonymous reviewers for helpful comments.
Editors: Srikanth Srinivasan
Series and Publisher: Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
Randomization plays a crucial role in computer science, offering significant benefits across various applications. However, obtaining true randomness can be challenging. It is therefore natural to study whether we can achieve the benefits of randomization while using few random bits.
One of the most basic uses of randomness is sampling. Given oracle access to an arbitrary function on a large domain, our goal is to estimate its average value. By drawing independent random samples, the Chernoff bound guarantees that the empirical average approximates the true average value with probability at least . This method uses full independence in sampling, but more efficient strategies can be pursued. This leads to the following definition:
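As a concrete baseline, the following is a minimal sketch (in Python) of this fully independent strategy; the function name, constants, and the Hoeffding form of the Chernoff bound are our illustration, not part of any construction in this paper. Note that it spends fresh random bits on every sample, which is exactly the cost the samplers studied here reduce.

```python
import math
import random

def mc_average(f, n, eps, delta, rng=random):
    """Estimate the mean of f over [n] with additive error eps,
    failing with probability at most delta (Hoeffding bound)."""
    t = math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))
    return sum(f(rng.randrange(n)) for _ in range(t)) / t

# Example: f is the indicator of the first half of the domain; true mean 0.5.
estimate = mc_average(lambda x: float(x < 500), n=1000, eps=0.05, delta=0.01)
```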
Definition 1 ([6]).
A function is a averaging sampler with samples using random bits if for every function , we have
The goal is to construct explicit samplers using a small number of random bits that have sample complexity close to the optimal. Researchers have made significant progress toward this goal, and a summary is presented in Table 1. Bellare and Rompel [6] suggested that interesting choices of parameters are and , which allow us to use random bits and generate samples. For simplicity, we assume throughout the paper (see Remark 10 for further discussion).
| Reference | Method | Random Bits | Sample Complexity |
|---|---|---|---|
| [10] | Lower Bound | | |
| [10] | Non-Explicit | | |
| Standard | Full Independence | | |
| [12] | Pairwise Independence | | |
| [17] | Expander Walks | | |
| [6] | Iterated Sampling | | |
| [43] | Hash-Based Extractors | | |
| [31] | Zig-Zag Extractors | | |
| Theorem 1 | Compose [31] with Almost -wise Ind. | | |
The best existing randomness-efficient averaging sampler comes from the equivalence between averaging samplers and extractors [43], which we will elaborate on later in the paper. Improving Zuckerman’s construction, Reingold, Vadhan, and Wigderson [31] provided a averaging sampler for domain that uses random bits for any positive constant . This almost matches the lower bound in [10]. However, a notable gap remains in sample complexity: the existing construction’s complexity does not align with the optimal . This raised the following open problem (see, e.g., [41, Open Problem 4.24], [18, Section 6]):
Problem 1.
Can we explicitly design a averaging sampler for domain that uses random bits and only samples?
We note that such algorithms do exist for general samplers, which query the function and produce an estimate through a more complicated computation than taking the average [5]. However, many applications require the use of averaging samplers, such as the original use in interactive proofs [6]. Beyond these applications, averaging samplers are fundamental combinatorial objects that relate to other notions such as randomness extractors, expander graphs, and list-decodable codes [43, 39].
1.1 Our Averaging Sampler
In this paper, we construct a polynomial-time computable averaging sampler with near-optimal sample complexity using an asymptotically optimal number of random bits. In fact, the sampler we construct is a strong sampler, defined as follows:
Definition 2.
A averaging sampler Samp is strong if for every sequence of functions , we have
We now state our main theorems about averaging samplers, which follow from a more general theorem that is slightly more complicated to state, Theorem 21.
Theorem 1.
For every constant , there exists an efficient strong averaging sampler for domain that uses random bits and samples.
This nearly resolves Problem 1. We also give a sampler with asymptotically optimal sample complexity but a worse randomness complexity.
Theorem 2.
There exists an efficient strong averaging sampler for domain that uses random bits and samples.
1.2 Matrix Samplers
A natural generalization of the classic Chernoff bound is the Matrix Chernoff Bound [33, 2, 37]. Suppose we wish to estimate for a matrix-valued function satisfying . By drawing independent random samples , the Matrix Chernoff Bound guarantees that
where $\|\cdot\|$ denotes the spectral norm. As in the real-valued case, we wish to derandomize this process without increasing the sample complexity too much. To address this, Wigderson and Xiao [42] initiated the study of randomness-efficient matrix samplers:
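For intuition, here is a toy numerical experiment (our own illustration, with assumed parameters) that estimates the mean of a matrix-valued function from independent samples and measures the error in spectral norm:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, t = 1000, 4, 5000

# A random Hermitian-valued f normalized so that ||f(x)|| <= 1 for all x.
mats = rng.standard_normal((n, d, d))
mats = (mats + mats.transpose(0, 2, 1)) / 2
mats /= np.abs(np.linalg.eigvalsh(mats)).max()

true_mean = mats.mean(axis=0)
sample_mean = mats[rng.integers(0, n, size=t)].mean(axis=0)
err = np.linalg.norm(sample_mean - true_mean, ord=2)  # spectral-norm error
```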
Definition 3.
A function is a -dimensional matrix sampler with samples using random bits if the following holds: For any function such that for all , we have
Extending the construction of non-explicit standard averaging samplers [10], we can show that there exists a non-explicit matrix sampler that requires only an additional bits of randomness compared to averaging samplers while achieving asymptotically optimal sample complexity.
Proposition 4.
There exists a (non-explicit) -dimensional matrix sampler for domain using samples and random bits.
However, explicitly constructing randomness-efficient matrix samplers turns out to be very challenging. While a union bound over matrix entries suggests that a randomness-optimal averaging sampler directly implies a randomness-optimal matrix sampler (see Lemma 17), this method incurs an unavoidable factor in sample complexity, making the dependence on exponentially worse than optimal. This raises an open question: can we construct a matrix sampler with (nearly) optimal randomness complexity and polynomial sample complexity, analogous to the averaging samplers in [6] and [43]?
Problem 2.
Can we explicitly design a -dimensional matrix sampler for domain that uses random bits and samples?
| Reference | Method | Random Bits | Sample Complexity |
|---|---|---|---|
| Proposition 4 | Non-Explicit | | |
| Standard | Full Independence | | |
| [42] | Union Bound Over Entries | | |
| [16] | Expander Walks | | |
| Theorem 3 | Iterated Sampler Composition | | |
We summarize prior matrix sampler constructions in Table 2. The best existing construction, a matrix analog of the expander walks sampler, was provided by Garg, Lee, Song, and Srivastava [16]. Similar to expander walks for real-valued sampling, this construction gives asymptotically optimal sample complexity, but the randomness complexity is worse than optimal by a factor. We note that even if we allow the matrix sampler to be non-averaging, no better construction is known.
In this work, we construct a polynomial-time computable matrix sampler with near-optimal randomness and sample complexity. The randomness complexity is optimal up to a logarithmic factor, and the sample complexity is within a factor of optimal for arbitrarily small constant . This brings us close to resolving Problem 2.
Theorem 3.
For any constant : There exists an efficient -dimensional matrix sampler for domain that uses random bits and samples.
Additionally, we construct a matrix sampler achieving asymptotically optimal randomness complexity, though at the cost of increased sample complexity. This breaks the barrier in sample complexity for randomness-optimal matrix samplers.
Theorem 4.
For any constant , there exists an efficient -dimensional matrix sampler for domain that uses random bits and samples.
1.3 Samplers for General Normed Vector Space
Apart from spectral norms of matrices, it is natural to study the averaging-sampling problem in other normed vector spaces .
Definition 5.
A function is a -sampler for a normed space and a class of functions if, for every ,
We call Samp a -sampler when .
Under this definition, a -dimensional matrix sampler is precisely a -sampler. Previous work also studied -samplers for a broader class of , such as subgaussian or subexponential real-valued functions [8, 1]. Extending our construction to other normed spaces and broader function classes remains an interesting direction for future research.
1.4 Randomness Extractors
Our sampler construction has implications for randomness extractors. A randomness extractor is a function that extracts almost-uniform bits from a low-quality source of randomness. We define the quality of a random source as its min-entropy.
Definition 6.
The min-entropy of a random variable $X$ is
$$H_\infty(X) = \min_x \log_2 \frac{1}{\Pr[X = x]}.$$
An $(n, k)$-source is a random variable on $n$ bits with min-entropy at least $k$.
Then a randomness extractor is defined as:
Definition 7 ([28]).
A function $\mathrm{Ext}: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is a $(k, \varepsilon)$ extractor if for every $(n, k)$-source $X$, the distribution $\mathrm{Ext}(X, U_d)$ is $\varepsilon$-close to $U_m$. We say Ext is a strong extractor if for every $(n, k)$-source $X$, the distribution $(Y, \mathrm{Ext}(X, Y))$ is $\varepsilon$-close to $(Y, U_m)$, where $Y$ is chosen from $U_d$.
Randomness extractors are essential tools in theoretical computer science. However, there has been little study of explicit extractors with the right dependence on the error for vanishing error. This is a particular concern in cryptography, where extractors are widely used as building blocks and security requirements demand superpolynomially small error [26, 40, 9, 14, 24, 23, 15]. Existentially, there are extractors with seed length $\log(n-k) + 2\log(1/\varepsilon) + O(1)$, and there is a matching lower bound [29].
Zuckerman [43] showed that averaging samplers are essentially equivalent to extractors. Specifically, an extractor can be seen as a sampler that generates $\mathrm{Ext}(x, i)$ as its $i$-th sample point using the random source $x$. Using this equivalence, we give the first extractor construction with optimal seed length up to an arbitrarily small constant factor bigger than 1, when the min-entropy for a large enough constant .
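Schematically, one direction of this correspondence can be sketched as follows, with `ext` a hypothetical black box computing the extractor output encoded as an integer:

```python
def sampler_from_extractor(ext, d_bits):
    """View an extractor Ext: {0,1}^n x {0,1}^d -> {0,1}^m as a sampler:
    the weak-source input x serves as the sampler's randomness, and the
    sampler outputs one point ext(x, y) for each of the 2^d seeds y."""
    def samp(x):
        return [ext(x, y) for y in range(2 ** d_bits)]
    return samp
```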
Theorem 5.
For every constant , there exists constant such that for all and , there is an efficient strong extractor with and .
Prior to our work, extractors whose seed length achieved the optimal dependence on the error, or came close to it, were based on the leftover hash lemma [7, 21, 20] and expander random walks [17, 44]. Extractors using the leftover hash lemma have a seed length of , which is far from optimal. Expander random walks give a extractor with and an optimal seed length of . Our extractor is better than expander walks for all vanishing by allowing smaller entropy .
In fact, if we aim to remove the and achieve the optimal seed length of to match expander random walks, we can set in Theorem 21 and get the following extractor for entropy rate for :
Theorem 6.
There exists constant such that for all and , there is an efficient strong extractor with and .
This is better than expander random walks’ entropy rate of for all .
1.5 List-Decodable Codes
Another perspective on averaging samplers is their connection to error-correcting codes. Ta-Shma and Zuckerman [35] showed that strong randomness extractors are equivalent to codes with good soft-decision decoding, which is related to list recovery. From this perspective, the composition scheme in our construction is similar to code concatenation.
For codes over the binary alphabet, soft-decision decoding amounts to list decodability, which we focus on here.
We give good list-decodable codes without using the composition. That is, by just applying our almost -wise independence sampler on the binary alphabet, we can get a binary list-decodable code with rate and non-trivial list size, although the list size is still exponential.
Theorem 7.
For every constant : there exists an explicit binary code with rate that is list-decodable with list size for some constant .
Prior to our work, the best known code rate was by Guruswami and Rudra [19]. We emphasize that their code achieved a list size of , while our list size is exponentially large, making our code unlikely to be useful.
1.6 Techniques
1.6.1 Averaging Samplers
Our construction of the averaging sampler is very simple, and is based on two observations:
1. Rather than querying every sample point produced by a sampler Samp, we can use an inner sampler to select a subset of samples for querying. This sub-sampling approach has been utilized in previous sampler constructions [6, 18]. Although the inner sampler incurs an additional randomness cost, the final sample complexity depends only on , leading to reduced overall sample complexity. Since the domain of is much smaller than the original domain, we can leverage more efficient sampling strategies.
2. The bottleneck of generating an almost -wise independent sequence over a large domain lies in sampling independent random points, which costs random bits. Since we can only afford random bits, we are restricted to generating constant-wise independent samples. However, for a much smaller domain, we can use few random bits to generate an almost -wise independent sequence for large .
Our construction is outlined as follows. Let be the extractor-based sampler in [31]. Let be an almost -wise independent sequence over domain , thinking of . Our sampler is then defined by
In this construction, we use the almost -wise independent sequence to sub-sample from the extractor-based sampler. This can be viewed as a composition, similar to other cases such as Justesen codes [22] and the first PCP theorem [4], where the goal is to optimize two main parameters simultaneously by combining two simpler schemes, each optimizing one parameter without significantly compromising the other.
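In code form, the composition is simple index translation. The sketch below treats `outer` (standing in for the extractor-based sampler of [31]) and `inner` (standing in for an almost k-wise independent generator over the index set of the outer sampler's output) as black boxes whose interfaces we assume:

```python
def composed_sampler(outer, inner):
    """outer(x): list of T candidate points in [n];
    inner(y): list of t indices in [T] to sub-sample."""
    def samp(x, y):
        w = outer(x)
        return [w[i] for i in inner(y)]
    return samp
```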
Previous works have applied almost -wise independence in extractor constructions. Srinivasan and Zuckerman [34] proved a randomness-efficient leftover hash lemma by sampling an almost -wise independent function using uniform seeds and outputting , where is the weak random source. From an extractor perspective, our inner sampler takes an inverse approach: we use to pick a function in the space of almost -wise independent functions, and then output . Furthermore, Raz’s two-source extractor [30] follows a more general framework, where two weak random sources are used – one to sample an almost -wise independent function and the other as its input. However, directly applying Raz’s error bound in our analysis (Lemma 20) results in a sample complexity that is off by a factor.
1.6.2 Matrix Samplers
Using the connection between averaging samplers and matrix samplers (see Lemma 17), our averaging sampler directly implies a matrix sampler using random bits and samples. This already gives the best randomness-optimal matrix sampler to date; however, its sample complexity has exponentially worse dependence on than optimal.
Our sub-sampling technique using almost -wise independence offers a way to further reduce sample complexity. The composition of samplers only depends on the triangle inequality, which also applies to spectral norms. The remaining task is to verify that almost -wise independence also provides good concentration bounds for matrix sampling, which is straightforward given the extensive literature on moment inequalities for random matrices [11, 25, 38].
Applying this composition, we get a matrix sampler using random bits and samples, as described in Theorem 4. This is close to optimal for cases where , though it is not yet sufficient for larger .
However, we can apply composition recursively. By repeating the composition times, the dependence on becomes . Each round of composition costs an additional random bits, resulting in a matrix sampler using random bits and samples. This already gives a matrix sampler using random bits and near-optimal sample complexity.
To further improve the dependence on in randomness complexity and achieve the bound in Theorem 3, we introduce an alternative way of composing samplers:
Proposition 8.
Suppose we are given two efficient matrix samplers:
- Let be a matrix sampler.
- Let be a averaging sampler.
Then, for uniformly random sources ,
is an efficient matrix sampler for domain with samples using random bits.
This essentially says that composing a good matrix sampler with a good standard averaging sampler gives a good matrix sampler. Although this slightly increases the sample complexity, we can use our sub-sampling technique to reduce it later.
Unlike sub-sampling, the composition of Proposition 8 generates multiple random seeds for , and we query all the samples it produces. This approach effectively reduces the error probability of from to . The key idea is that only an fraction of the seeds generated by lead to failure in , contributing only a tolerable additive error in the estimate of . The reasoning is straightforward: at most an fraction of all possible seeds for cause failure, and with probability , does not oversample these failure seeds by more than an additional proportion. As a result, the final proportion of failure seeds remains bounded by .
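A sketch of this seed-sampling composition, again with black-box stand-ins of our own naming:

```python
def seed_composed_sampler(mat_samp, seed_samp):
    """seed_samp(x): seeds for the matrix sampler, chosen by an averaging
    sampler; every sample under every chosen seed is queried."""
    def samp(x):
        out = []
        for y in seed_samp(x):
            out.extend(mat_samp(y))
        return out
    return samp
```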
Remark 9.
We can also define strong matrix samplers as a matrix analog of strong averaging samplers. All results for matrix samplers in this paper would hold for strong matrix samplers as well, with proofs following similar arguments. However, for simplicity, we present our results in the non-strong case only.
2 Preliminaries
Notation. We use $[n]$ to denote the set $\{1, 2, \ldots, n\}$. For an integer $m$, $U_m$ is a random variable distributed uniformly over $\{0,1\}^m$. For random variables $X$ and $Y$, we write $|X - Y| \le \varepsilon$ to denote that the statistical distance (total variation distance) between $X$ and $Y$ is at most $\varepsilon$, i.e., $\max_{T} \left|\Pr[X \in T] - \Pr[Y \in T]\right| \le \varepsilon$.
We refer to an algorithm as “efficient” if it is polynomial-time computable. For simplicity, we omit domain sizes for samplers and matrix dimensions when context permits. Unless otherwise specified, statements such as “there exists a sampler” mean that for all , there exists a sampler with the stated properties.
Remark 10.
The condition is very mild and holds in nearly all applications. This requirement can be relaxed to for averaging samplers, and to for matrix samplers, where is an arbitrarily small positive constant. Such relaxations do not alter the results.
In the extreme case where for every constant , pairwise independence is already a near-optimal averaging sampler (see Lemma 12). Specifically, this yields an efficient strong sampler with samples, using only random bits. Similarly, for matrix samplers under the condition for all , pairwise independence also achieves near-optimal efficiency with samples and random bits.
2.1 Extractor-Based Sampler
As mentioned above, averaging samplers are equivalent to extractors. We will introduce this in detail in Section 5.1. Reingold, Vadhan, and Wigderson used this equivalence to achieve the following:
Theorem 11 ([31, Corollary 7.3], see also [18, Theorem 6.1]).
For every constant , there exists an efficient averaging sampler over with samples using random bits.
For ease of presentation, we often denote an extractor-based averaging sampler by , where is the -th output sample point of the sampler using randomness input . Therefore, the sample complexity of is .
2.2 Almost -wise Independence
A sequence is pairwise independent if the marginal distribution of every pair is uniformly random. Chor and Goldreich [12] proved that pairwise independence yields a sampler that uses few random bits but has unsatisfactory sample complexity.
Lemma 12 ([12]).
For all , there exists an efficient strong averaging sampler for domain with samples using random bits.
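For illustration, here is a toy pairwise-independent sampler in the spirit of [12], simplified to a prime-field domain (the prime-field choice and interface are ours):

```python
import random

def pairwise_samples(p, t, rng=random):
    """z_i = a*i + b mod p for i = 0, ..., t-1 with t <= p and p prime.
    Each z_i is uniform over [p] and any two are independent, yet only
    about 2*log2(p) random bits are used."""
    a, b = rng.randrange(p), rng.randrange(p)
    return [(a * i + b) % p for i in range(t)]
```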
Generalizing pairwise independence, an almost -wise independent sequence is a sequence of random variables such that the marginal distribution of every of them is close to uniform.
Definition 13 ([27]).
A sequence of random variables is said to be -almost -wise independent if for all subsets such that ,
In particular, the pairwise independent sequence mentioned above is a -almost -wise independent sequence. Naor and Naor [27] proved that such sequences can be generated randomness-efficiently; we use the construction of Alon, Goldreich, Håstad, and Peralta [3].
Lemma 14 ([3]).
There exists an efficient algorithm that uses random bits to generate a -almost -wise independent sequence .
Using standard techniques, we have the following concentration bound for almost -wise independent sequences (see Appendix B for the proof). Similar bounds for exact -wise independent sequences have been shown in [6, 13].
Lemma 15.
Let be a sequence of -almost -wise independent variables for an even integer . Then for every sequence of functions ,
2.3 Composition of Samplers
The idea of composing samplers has been studied before. More specifically, Goldreich proved the following proposition.
Proposition 16 ([18]).
Suppose we are given two efficient samplers:
- A averaging sampler for domain with samples using random bits.
- A averaging sampler for domain with samples using random bits.
Then, there exists an efficient averaging sampler for domain with samples using random bits.
2.4 Averaging Samplers Imply Matrix Samplers
When Wigderson and Xiao first introduced matrix samplers, they observed that an averaging sampler also functions as a matrix sampler with weaker parameters, though they did not provide a formal proof. We formalize this observation below:
Lemma 17.
A averaging sampler is a -dimensional matrix sampler.
The proof is presented in Appendix C.
3 Construction of Averaging Samplers
Our construction is based on a reduction lemma that constructs a sampler for domain based on a sampler for domain . We exploit the fact that when composing averaging samplers, the final sample complexity depends on only one of the samplers. Our strategy is:
- Apply the extractor sampler in Theorem 11 as a sampler over domain . This uses random bits and generates samples.
- By Proposition 16, we only need to design a averaging sampler over domain using random bits. The total sample complexity will be equal to the sample complexity of this sampler. For this sampler, we use almost -wise independence.
Following the idea of Proposition 16, we first prove that composing samplers maintains the properties of a strong sampler.
Lemma 18 (Strong Composition).
Suppose we are given two efficient averaging samplers:
- Let be a sampler.
- Let be a strong sampler.
Then, for uniformly random sources and ,
is an efficient strong averaging sampler for domain with samples using random bits.
Proof.
Let be an arbitrary sequence of functions, and define . Since is a averaging sampler and is bounded in , we have
Equivalently, we can express this as
(1)
For an arbitrary , view as a Boolean function on domain . Therefore, since are generated by a strong sampler,
(2)
By the triangle inequality and a union bound over equations (1) and (2), we have
This proves that the sampler we constructed is a strong averaging sampler.
Instantiating Lemma 18 with the extractor-based sampler from Theorem 11 gives:
Lemma 19 (Reduction Lemma).
For any : For a sufficiently large constant , suppose there exists an efficient averaging sampler for domain with samples using random bits. Then there exists an efficient averaging sampler Samp for domain with samples using random bits. Moreover, if is a strong sampler, then Samp is also strong.
Proof.
By Theorem 11, there exists an explicit averaging sampler with
when is large enough. Therefore, can work for domain .
This enables us to apply Lemma 18. The total number of random bits used is , and a total of samples are used.
Next, we show that for domain with , we can use an almost -wise independent sequence to design a strong averaging sampler with near-optimal sample complexity.
Lemma 20.
For any , there exists an efficient strong averaging sampler for domain with samples using random bits.
Proof.
We begin by setting , , and . We then define our sampler by outputting a -almost -wise independent sequence . Taking the parameters of Lemma 15, observe
and
Therefore, for every sequence of functions ,
Furthermore, Lemma 14 shows that we have an efficient algorithm that uses only
random bits to generate this -almost -wise independent sequence.
We remark that the construction in Lemma 20 can be replaced with a perfectly -wise independent sequence. This yields a slightly weaker sampler that uses random bits and samples. This looser construction is already sufficient to establish Theorem 1. However, our tighter construction in Lemma 20 applies more broadly, particularly when is small – for example, in Lemma 36.
Theorem 21.
For every constant , and for any , , and , there exists an efficient strong averaging sampler for domain that uses
random bits and samples.
Proof.
By Lemma 19, our goal is to design an efficient strong averaging sampler for domain for some large enough constant . By Lemma 20, for any , such a sampler exists using samples and
Taking these into Lemma 19 gives us the desired bounds.
For an arbitrarily small constant , by setting in Theorem 21, we get Theorem 1 as a corollary.

Theorem 1. [Restated, see original statement.]
We can also set in Theorem 21 and get the following sampler with asymptotically optimal sample complexity but a worse randomness complexity.
Theorem 2. [Restated, see original statement.]
There exists an efficient strong averaging sampler for domain that uses random bits and samples.
4 Construction of Matrix Samplers
Before moving further, we note that non-explicitly, a good matrix sampler exists. This generalizes the non-explicit sampler given in [10], with the proof deferred to Appendix A.
Proposition 4. [Restated, see original statement.]
There exists a (non-explicit) -dimensional matrix sampler for domain using samples and random bits.
Our improved averaging sampler directly implies the best randomness-optimal matrix sampler to date. Applying Lemma 17 to our sampler in Theorem 1 gives:
Lemma 22.
For every constant , there exists an efficient -dimensional matrix sampler for domain using random bits and samples.
However, compared to the optimal sample complexity given in the non-explicit construction, our dependence on is exponentially worse. As is potentially very large, our goal is to utilize our composition to reduce the sample complexity while not increasing the randomness complexity too much.
4.1 One-Layer Composition
It is easy to verify that the composition lemma holds for matrices:
Lemma 23 (Matrix Composition).
Suppose we are given two efficient matrix samplers:
-
Let be a matrix sampler.
-
Let be a matrix sampler.
Then, for uniformly random sources and ,
is an efficient matrix sampler for domain with samples using random bits.
The proof is essentially the same as that of Lemma 18, since the triangle inequality also applies to spectral norms. Because we are not dealing with the strong case, we only need a union bound over two bad events: a bad sample from and a bad sample from .
The following lemma is the matrix version of Lemma 20, and we delay its proof to Section 4.4.
Lemma 24.
For any , there exists an efficient -dimensional matrix sampler for domain with samples using random bits.
Lemma 25.
Suppose we have an efficient -dimensional matrix sampler for domain with samples using bits. For any constant such that , we can construct an efficient -dimensional matrix sampler for domain with samples using bits.
Proof.
When , by setting in Lemma 24, we have a strong matrix sampler for domain with samples using
random bits. Then by Lemma 23, we have the theorem we want.
Theorem 4. [Restated, see original statement.]
For any constant , there exists an efficient -dimensional matrix sampler for domain that uses random bits and samples.
Proof.
4.2 Iterated Composition
Lemma 26.
There is an efficient -dimensional matrix sampler for domain using random bits and samples.
Proof.
Let . We will prove that there exists a constant such that for each , there exists an efficient matrix sampler using at most random bits and samples, where
The case proves the lemma.
We prove this by induction on from to .
Base Case ().
By Lemma 22, there exists an efficient -dimensional matrix sampler using random bits and samples for some constants . When and , we have
and
Inductive Step.
Assume that for some , there exists an efficient matrix sampler using random bits and samples. By choosing some constant such that in Lemma 25, we have a matrix sampler using random bits and samples for some constants and . When and , we have
and
This finishes the induction and proves the lemma. Using Lemma 25, we have the following lemma:
Lemma 27.
For any constant : There exists an efficient -dimensional matrix sampler for domain using random bits and samples.
Proof.
4.3 Another Composition Scheme
To further reduce the number of random bits used in Lemma 27, we introduce another way of composing matrix samplers. Instead of sub-sampling the samples, we sample the random seeds here.
Proposition 8. [Restated, see original statement.]
Suppose we are given two efficient matrix samplers:
-
Let be a matrix sampler.
-
Let be a averaging sampler.
Then, for uniformly random sources ,
is an efficient matrix sampler for domain with samples using random bits.
Proof.
Let be a function such that for all . We define as follows:
Then . One good property of is that
For the output of , we have
Therefore, with probability,
Then we have
This shows that, with probability , the error of Samp is at most .

Theorem 3. [Restated, see original statement.]
Proof.
We apply Proposition 8 with the following choices:
- : Use the matrix sampler for domain in Lemma 27 by choosing . This uses random bits and samples.
- : Use the averaging sampler for domain in Theorem 1 by choosing . This uses random bits and samples.
This gives as an efficient matrix sampler using random bits and samples.
Setting and in Lemma 25 and applying it to this sampler reduces the sample complexity to for any constant .
4.4 Proof of Lemma 24
4.4.1 Concentration of Random Matrices
The goal of this section is to prove Lemma 30, an analog of Lemma 15 for random Hermitian matrices. Our approach follows standard techniques from the random matrix literature [36, 11, 25].
Lemma 28.
Let be a sequence of independent, mean-zero, self-adjoint random matrices of size such that for every . Then, for every even integer , we have
where the sequence consists of independent Rademacher random variables.
Proof.
Note that
When some index appears an odd number of times in the word , the corresponding term vanishes. Hence we may restrict the sum to
and we have
where the last inequality comes from the fact that always holds for every .
It remains to bound . Since each index appearing in must occur an even number of times, one can group the positions into exactly unordered pairs so that the two entries in each pair carry the same index. The number of ways to pair is bounded by . For each pairing there are choices of index per pair. Therefore,
Therefore,
which proves the lemma.
Using a standard symmetrization trick, we get the following lemma, which is an analog of Proposition 39:
Lemma 29.
Let be a sequence of independent, mean-zero, self-adjoint random matrices of size such that for every . Then, for every even integer , we have
Proof.
Let us write
where is an independent copy of , independent also of the original ’s. Since each has mean zero, we have
where the last inequality follows from Jensen’s inequality.
Next, observe that
Introduce a fresh sequence of Rademacher random variables , independent of everything else. Then, conditional on , the random matrix
has the same distribution as . Consequently,
Note that for each fixed ,
Taking , then , and using that has the same law as gives
In combination with the symmetrization bound , this yields
Taking in the bound in Lemma 28, we have
Lemma 30.
Let be a sequence of -almost -wise independent variables for a positive even integer . Then for any such that for all , is Hermitian and , we have
Proof.
Let . We have
Note that each is always a Hermitian matrix, so their sum is always Hermitian and therefore normal. Then we have
Moreover, since is a positive even integer, is positive semidefinite. Therefore,
Let be a sequence of independent random variables where .
4.4.2 Almost -wise Independence for Small Domains
We first prove that the concentration analysis for Hermitian matrices directly implies the general case.
Lemma 31.
Let be a function. Suppose for any such that for all , is Hermitian and ,
Then Samp is a -dimensional matrix sampler.
Proof.
Let . We are going to prove that for any function such that for all , we have
For any matrix , its Hermitian dilation is defined by
It is easy to verify that . Then, for function , we have
Note that we have
Hence,
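A quick numerical sanity check of the dilation identity used above (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# Hermitian dilation H(M) = [[0, M], [M*, 0]].
H = np.block([[np.zeros((3, 3)), M], [M.conj().T, np.zeros((3, 3))]])

assert np.allclose(H, H.conj().T)                              # Hermitian
assert np.isclose(np.linalg.norm(H, 2), np.linalg.norm(M, 2))  # equal spectral norms
```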
Proof.
We begin by setting , , and . We then define our sampler by outputting a -almost -wise independent sequence . Taking the parameters of Lemma 30, observe
and
Let be a function. Suppose for any such that for all , is Hermitian and ,
Then Samp is a -dimensional matrix sampler by Lemma 31. Therefore, by Lemma 30, we have
Our sampler uses
samples. Furthermore, Lemma 14 shows that we have an efficient algorithm that uses only
random bits to generate this -almost -wise independent sequence.
5 Applications to Extractors and Codes
5.1 Applications to Extractors
Zuckerman showed that averaging samplers are equivalent to randomness extractors [43]. Here we state the only direction that we need.
Lemma 33 ([43]).
An efficient strong averaging sampler gives an efficient strong extractor .
Theorem 5. [Restated, see original statement.]
For every constant , there exists constant such that for all and , there is an efficient strong extractor with and .
Proof.
By Theorem 1, for any positive constant , there exists a constant such that there exists an efficient strong averaging sampler for domain with samples using random bits.
To construct the required strong extractor for every , we set such that . Then, we construct an efficient strong sampler Samp for domain where
By the above, Samp uses random bits and generates samples.
By Lemma 33, Samp implies an efficient strong extractor with . It remains only to verify that for some constant . We have
This proves the theorem.
Theorem 6. [Restated, see original statement.]
Proof.
By Theorem 2, there exists a constant such that there exists an efficient strong averaging sampler for domain with samples using random bits.
To construct the required strong extractor for every , we set such that . Then, we construct an efficient strong sampler Samp for domain where
By the above, Samp uses random bits and generates samples.
By Lemma 33, Samp implies an efficient strong extractor with . It remains only to verify that for some constant . We have
This proves the theorem.
5.2 Application to List-Decodable Codes
Error-correcting codes are combinatorial objects that enable messages to be accurately transmitted, even when parts of the data get corrupted. Codes have been extensively studied and have proven to be extremely useful in computer science. Here we focus on the combinatorial property of list-decodability, defined below.
Definition 34.
A code is list-decodable if for every received message , there are at most messages such that , where denotes the Hamming distance. A code is binary if .
We focus on the binary setting, i.e., .
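As a brute-force illustration of Definition 34 (purely combinatorial, not an efficient decoder; the names are ours):

```python
def list_decode_brute_force(code, received, rho):
    """Return all codewords within relative Hamming distance rho of the
    received word; list-decodability bounds the size of this list."""
    n = len(received)
    hamming = lambda u, v: sum(a != b for a, b in zip(u, v))
    return [c for c in code if hamming(c, received) <= rho * n]
```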
Lemma 35 ([35]).
An efficient strong averaging sampler over the binary domain gives an efficient binary code that is list-decodable with code rate .
To construct our codes, we will use our almost -wise independence sampler in Lemma 20 directly.
Lemma 36.
For all constant , there exists an efficient strong averaging sampler for binary domain with samples using random bits for some constant .
Proof.
6 Open Problems
Our work raises interesting open problems.
- Compared to the sampler in [31], which uses random bits, our averaging sampler requires random bits. Can we improve our randomness efficiency while maintaining a good sample complexity?
- Is there a way to eliminate the additional in the sample complexity? For and , can we design an efficient averaging sampler that is asymptotically optimal in both randomness and sample complexity?
- Can we further improve the randomness complexity of our matrix samplers to fully resolve Problem 2?
- Is it possible to reduce the list size of the list-decodable codes in Theorem 7 to using the structure of the list?
- Can we construct randomness-efficient -samplers on other normed spaces ?
References
- [1] Rohit Agrawal. Samplers and Extractors for Unbounded Functions. In Dimitris Achlioptas and László A. Végh, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019), volume 145 of Leibniz International Proceedings in Informatics (LIPIcs), pages 59:1–59:21, Dagstuhl, Germany, 2019. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.APPROX-RANDOM.2019.59.
- [2] Rudolf Ahlswede and Andreas Winter. Strong converse for identification via quantum channels. IEEE Transactions on Information Theory, 48(3):569–579, 2002. doi:10.1109/18.985947.
- [3] Noga Alon, Oded Goldreich, Johan Håstad, and René Peralta. Simple constructions of almost k-wise independent random variables. Random Structures & Algorithms, 3(3):289–304, 1992. doi:10.1002/RSA.3240030308.
- [4] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555, May 1998. doi:10.1145/278298.278306.
- [5] Mihir Bellare, Oded Goldreich, and Shafi Goldwasser. Randomness in interactive proofs. Computational Complexity, 3:319–354, 1993. doi:10.1007/BF01275487.
- [6] Mihir Bellare and John Rompel. Randomness-efficient oblivious sampling. In Proceedings 35th Annual Symposium on Foundations of Computer Science, pages 276–287. IEEE, 1994. doi:10.1109/SFCS.1994.365687.
- [7] Charles H Bennett, Gilles Brassard, and Jean-Marc Robert. Privacy amplification by public discussion. SIAM Journal on Computing, 17(2):210–229, 1988. doi:10.1137/0217014.
- [8] Jarosław Błasiok. Optimal streaming and tracking distinct elements with high probability. ACM Transactions on Algorithms (TALG), 16(1):1–28, 2019. doi:10.1145/3309193.
- [9] Ran Canetti, Yevgeniy Dodis, Shai Halevi, Eyal Kushilevitz, and Amit Sahai. Exposure-resilient functions and all-or-nothing transforms. In Advances in Cryptology – EUROCRYPT 2000, International Conference on the Theory and Application of Cryptographic Techniques, Bruges, Belgium, May 14-18, 2000, Proceedings, pages 453–469. Springer, 2000. doi:10.1007/3-540-45539-6_33.
- [10] Ran Canetti, Guy Even, and Oded Goldreich. Lower bounds for sampling algorithms for estimating the average. Information Processing Letters, 53(1):17–25, 1995. doi:10.1016/0020-0190(94)00171-T.
- [11] Richard Y Chen, Alex Gittens, and Joel A Tropp. The masked sample covariance estimator: an analysis using matrix concentration inequalities. Information and Inference: A Journal of the IMA, 1(1):2–20, 2012.
- [12] Benny Chor and Oded Goldreich. On the power of two-point based sampling. Journal of Complexity, 5(1):96–106, 1989. doi:10.1016/0885-064X(89)90015-0.
- [13] Yevgeniy Dodis. PhD thesis: Exposure-resilient cryptography. Massachusetts Institute of Technology, September 2000.
- [14] Yevgeniy Dodis and Joel Spencer. On the (non)universality of the one-time pad. In 43rd Symposium on Foundations of Computer Science (FOCS 2002), 16-19 November 2002, Vancouver, BC, Canada, Proceedings, pages 376–385. IEEE Computer Society, 2002.
- [15] Yevgeniy Dodis and Daniel Wichs. Non-malleable extractors and symmetric key cryptography from weak secrets. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31–June 2, 2009, pages 601–610. ACM, 2009. doi:10.1145/1536414.1536496.
- [16] Ankit Garg, Yin Tat Lee, Zhao Song, and Nikhil Srivastava. A matrix expander chernoff bound. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 1102–1114, 2018. doi:10.1145/3188745.3188890.
- [17] David Gillman. A chernoff bound for random walks on expander graphs. SIAM Journal on Computing, 27(4):1203–1220, 1998. doi:10.1137/S0097539794268765.
- [18] Oded Goldreich. A sample of samplers: A computational perspective on sampling. In Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation: In Collaboration with Lidor Avigad, Mihir Bellare, Zvika Brakerski, Shafi Goldwasser, Shai Halevi, Tali Kaufman, Leonid Levin, Noam Nisan, Dana Ron, Madhu Sudan, Luca Trevisan, Salil Vadhan, Avi Wigderson, David Zuckerman, pages 302–332. Springer, 2011. doi:10.1007/978-3-642-22670-0_24.
- [19] Venkatesan Guruswami and Atri Rudra. Explicit codes achieving list decoding capacity: Error-correction with optimal redundancy. IEEE Transactions on Information Theory, 54(1):135–150, 2008. doi:10.1109/TIT.2007.911222.
- [20] Johan Håstad, Russell Impagliazzo, Leonid A Levin, and Michael Luby. A pseudorandom generator from any one-way function. SIAM Journal on Computing, 28(4):1364–1396, 1999. doi:10.1137/S0097539793244708.
- [21] Russell Impagliazzo and David Zuckerman. How to recycle random bits. In FOCS, volume 30, pages 248–253, 1989. doi:10.1109/SFCS.1989.63486.
- [22] Jørn Justesen. Class of constructive asymptotically good algebraic codes. IEEE Transactions on Information Theory, 18(5):652–656, 1972. doi:10.1109/TIT.1972.1054893.
- [23] Yael Tauman Kalai, Xin Li, and Anup Rao. 2-source extractors under computational assumptions and cryptography with defective randomness. In 50th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2009, October 25-27, 2009, Atlanta, Georgia, USA, pages 617–626. IEEE Computer Society, 2009. doi:10.1109/FOCS.2009.61.
- [24] Yael Tauman Kalai, Xin Li, Anup Rao, and David Zuckerman. Network extractor protocols. In 49th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2008, October 25-28, 2008, Philadelphia, PA, USA, pages 654–663. IEEE Computer Society, 2008. doi:10.1109/FOCS.2008.73.
- [25] Michel Ledoux and Michel Talagrand. Probability in Banach Spaces: isoperimetry and processes. Springer Science & Business Media, 2013.
- [26] Chin-Laung Lu. Hyper-encryption against space-bounded adversaries from on-line strong extractors. In Advances in Cryptology - CRYPTO 2002, 22nd Annual International Cryptology Conference, Santa Barbara, California, USA, August 18-22, 2002, Proceedings, pages 257–271. Springer, 2002. doi:10.1007/3-540-45708-9_17.
- [27] Joseph Naor and Moni Naor. Small-bias probability spaces: efficient constructions and applications. SIAM Journal on Computing, 22(4):838–856, 1993. doi:10.1137/0222053.
- [28] Noam Nisan and David Zuckerman. Randomness is linear in space. Journal of Computer and System Sciences, 52(1):43–52, 1996. doi:10.1006/JCSS.1996.0004.
- [29] Jaikumar Radhakrishnan and Amnon Ta-Shma. Bounds for dispersers, extractors, and depth-two superconcentrators. SIAM Journal on Discrete Mathematics, 13(1):2–24, 2000. doi:10.1137/S0895480197329508.
- [30] Ran Raz. Extractors with weak random seeds. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 11–20, 2005. doi:10.1145/1060590.1060593.
- [31] Omer Reingold, Salil Vadhan, and Avi Wigderson. Entropy waves, the zig-zag graph product, and new constant-degree expanders and extractors. In Proceedings 41st Annual Symposium on Foundations of Computer Science, pages 3–13. IEEE, 2000. doi:10.1109/SFCS.2000.892006.
- [32] Yao-Feng Ren and Han-Ying Liang. On the best constant in Marcinkiewicz–Zygmund inequality. Statistics & Probability Letters, 53(3):227–233, 2001.
- [33] Mark Rudelson. Random vectors in the isotropic position. Journal of Functional Analysis, 164(1):60–72, 1999.
- [34] Aravind Srinivasan and David Zuckerman. Computing with very weak random sources. SIAM Journal on Computing, 28(4):1433–1459, 1999. doi:10.1137/S009753979630091X.
- [35] Amnon Ta-Shma and David Zuckerman. Extractor codes. IEEE Transactions on Information Theory, 50(12):3015–3025, 2004. doi:10.1109/TIT.2004.838377.
- [36] Nicole Tomczak-Jaegermann. The moduli of smoothness and convexity and the rademacher averages of the trace classes (). Studia Mathematica, 50(2):163–182, 1974. URL: http://eudml.org/doc/217886.
- [37] Joel A Tropp. User-friendly tail bounds for sums of random matrices. Foundations of computational mathematics, 12:389–434, 2012. doi:10.1007/S10208-011-9099-Z.
- [38] Joel A Tropp et al. An introduction to matrix concentration inequalities. Foundations and Trends® in Machine Learning, 8(1-2):1–230, 2015. doi:10.1561/2200000048.
- [39] Salil Vadhan. The unified theory of pseudorandomness: guest column. ACM SIGACT News, 38(3):39–54, 2007. doi:10.1145/1324215.1324225.
- [40] Salil P. Vadhan. On constructing locally computable extractors and cryptosystems in the bounded storage model. In Advances in Cryptology - CRYPTO 2003, 23rd Annual International Cryptology Conference, Santa Barbara, California, USA, August 17-21, 2003, Proceedings, pages 61–77. Springer, 2003. doi:10.1007/978-3-540-45146-4_4.
- [41] Salil P. Vadhan. Pseudorandomness. Foundations and Trends® in Theoretical Computer Science, 7(1–3):1–336, 2012. doi:10.1561/0400000010.
- [42] Avi Wigderson and David Xiao. A randomness-efficient sampler for matrix-valued functions and applications. In 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS’05), pages 397–406. IEEE, 2005. doi:10.1109/SFCS.2005.8.
- [43] David Zuckerman. Randomness-optimal oblivious sampling. Random Structures & Algorithms, 11(4):345–367, 1997. doi:10.1002/(SICI)1098-2418(199712)11:4<345::AID-RSA4>3.0.CO;2-Z.
- [44] David Zuckerman. Linear degree extractors and the inapproximability of max clique and chromatic number. Theory of Computing, 3(6):103–128, 2007. doi:10.4086/toc.2007.v003a006.
Appendix A Proof of Proposition 4
In this section, we extend the non-explicit averaging sampler construction from [10] to matrix samplers.
Lemma 37.
Let Samp be a -dimensional matrix sampler for domain using samples. Then there exists a -dimensional matrix sampler for domain using random bits and samples.
Proof.
Our goal is to construct a matrix sampler based on Samp.
Output Approximation via Discretization.
We construct a discretization grid by
where . For each , define an approximation function that rounds each entry of to the nearest point in , yielding . Since each entry in differs from by at most , the total approximation error per matrix (in spectral norm) is bounded by according to Proposition 40. Thus, has an average that approximates the average of within , and the set of all such approximations forms a finite function class, which we denote .
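A sketch of the rounding step, with `gamma` a stand-in for the grid spacing defined above:

```python
import numpy as np

def round_to_grid(M, gamma):
    """Round real and imaginary parts of each entry to multiples of gamma.
    The entrywise error is at most gamma, so the spectral-norm error of a
    d x d matrix is at most d * gamma."""
    return gamma * (np.round(M.real / gamma) + 1j * np.round(M.imag / gamma))
```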
Bounding the Size of .
Each entry of a matrix in has at most possible values. The number of possible matrices is therefore bounded by
so the total number of functions in is
Probabilistic Reduction of Random Bits.
For each function , let a random seed be called bad if the estimate of Samp deviates from the true average of by more than . Since Samp is a -sampler, the fraction of bad random seeds for any is at most . By Hoeffding’s inequality, if we select seeds independently at random, the probability that more than of them are bad is at most . Applying a union bound over all , the probability that there exists any with more than bad seeds is at most .
Choosing and Applying Probabilistic Method.
Set
so that . With this choice, there exists a set of random seeds such that, for all , the fraction of bad seeds in is at most . The number of random bits required to select a sequence is
Defining the New Sampler .
We define as follows: Select a random seed , and run to get samples . With probability, we have
is then a matrix sampler, using only
random bits. This completes the proof.
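The resulting sampler is simply seed restriction, as in the minimal sketch below (`samp` and `seeds` stand in for Samp and the selected seed sequence):

```python
import random

def restricted_sampler(samp, seeds, rng=random):
    """Draw one of the pre-selected good seeds, which costs only
    log2(len(seeds)) random bits, and run the original sampler on it."""
    return samp(rng.choice(seeds))
```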
Theorem 38 (Matrix Chernoff Bound, see [38]).
Let be independent complex random matrices. Suppose for all . Then, for any , the following inequality holds:
Applying the Matrix Chernoff Bound, we can prove Proposition 4.
Proposition 4. [Restated, see original statement.]
There exists a (non-explicit) -dimensional matrix sampler for domain using samples and random bits.
Proof.
By Theorem 38, taking independent samples in would give a -dimensional matrix sampler for domain using samples and random bits. Applying Lemma 37, we get a -dimensional matrix sampler for domain using samples and random bits.
Appendix B Proof of Lemma 15
Proposition 39 (Marcinkiewicz–Zygmund inequality [32]).
Let be a sequence of independent random variables with , . Then for :
where .
Lemma 15. [Restated, see original statement.]
Let be a sequence of -almost -wise independent variables for an even integer . Then for every sequence of functions ,
Proof.
Let . We have
Let be a sequence of independent random variables where . Since the ’s are -almost -wise independent and , we have
Since and , they satisfy the conditions of the Marcinkiewicz–Zygmund inequality. We have
Therefore,
Appendix C Proof of Lemma 17
To prove Lemma 17, we need the following property of matrix norms:
Proposition 40.
Let and define
Then the spectral norm of satisfies
Proof.
Select standard basis vectors such that . Then,
We also have
Lemma 17. [Restated, see original statement.]
A averaging sampler is a -dimensional matrix sampler.
Proof.
Let be the sampler’s output. We define
Now we fix some . For all , we have by Proposition 40. Then, since are the output of a averaging sampler, we have
where and are the functions that extract the real part and imaginary part of , respectively. Taking a union bound, we have, with probability,
By a union bound over all entries, with probability, all entries have an additive error bounded by , and this implies that by Proposition 40.
