
How to Construct Random Strings

Oliver Korten, Department of Computer Science, Columbia University, New York, NY, USA
Rahul Santhanam, Department of Computer Science, University of Oxford, UK
Abstract

We address the following fundamental question: is there an efficient deterministic algorithm that, given 1^n, outputs a string of length n that has polynomial-time-bounded Kolmogorov complexity Ω̃(n) or even n − o(n)?

Under plausible complexity-theoretic assumptions, stating for example that there is an ϵ>0 for which 𝖳𝖨𝖬𝖤[T(n)] ⊄ 𝖳𝖨𝖬𝖤^𝖭𝖯[T(n)^ϵ]/2^{ϵn} for appropriately chosen time-constructible T, we show that the answer to this question is positive (answering a question of [27]), and that the Range Avoidance problem [18, 20, 27] is efficiently solvable for uniform sequences of circuits with close to minimal stretch (answering a question of [14]).

We obtain our results by giving efficient constructions of pseudo-random generators with almost optimal seed length against algorithms with small advice, under assumptions of the form mentioned above. We also apply our results to give the first complexity-theoretic evidence for explicit constructions of objects such as rigid matrices (in the sense of Valiant) and Ramsey graphs with near-optimal parameters.

Keywords and phrases:
Explicit Constructions, Kolmogorov Complexity, Derandomization
Copyright and License:
© Oliver Korten and Rahul Santhanam; licensed under Creative Commons License CC-BY 4.0
2012 ACM Subject Classification:
Theory of computation Complexity theory and logic
Acknowledgements:
The authors would like to thank Hanlin Ren for many useful discussions.
Editors:
Srikanth Srinivasan

1 Introduction

1.1 Motivation

Constructing Random Strings.

Time-bounded Kolmogorov complexity 𝖪^T [22] is a fundamental measure of the complexity of a Boolean string. Given a string x and a time bound T, 𝖪^T(x) is the size of the smallest program p such that 𝒰(p) outputs x within T steps, where 𝒰 is a universal Turing machine fixed in advance. Intuitively, for feasible time bounds T(n), 𝖪^{T(n)}(x) measures the inherent compressibility or “structure” in x from the point of view of efficient algorithms, in the sense that any x with low 𝖪^{T(n)} complexity has a short description from which it can be recovered efficiently. Thus a string with high 𝖪^T complexity can be considered unstructured or random-like.

Given functions T, ℓ: ℕ → ℕ and a sequence (x_n)_{n∈ℕ} of strings, where each x_n is of length n, let us call the sequence (𝖪^T, ℓ)-hard if 𝖪^{T(n)}(x_n) ≥ ℓ(n) for all n. We are interested in the complexity of producing a (𝖪^T, ℓ)-hard sequence of strings, for polynomial time bounds T and for ℓ as large as possible. Three particularly important settings we focus on here will be ℓ(n) = n^{Ω(1)}, ℓ(n) = Ω̃(n), and ℓ(n) = n − o(n). A simple argument shows that 𝖪^T-hard strings cannot be produced in time T′ = o(T/log T) by an algorithm 𝒜 which outputs x_n when given n in unary. Indeed, the existence of such an algorithm would imply that 𝖪^T(x_n) = O(log n), as the universal machine 𝒰 can output x_n in less than T steps when given n in binary and a constant-size program for 𝒜 (here we are using the standard fact that any algorithm 𝒜 running in time T′ can be simulated by the universal machine 𝒰 in time O(T′ log T′)). But is it possible for such a sequence to be produced by an algorithm running in time poly(T)?
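The accounting behind this simulation argument can be made explicit; as a sketch in the notation above:

```latex
\[
  \mathsf{K}^{T(n)}(x_n)
    \;\le\; \underbrace{O(1)}_{\text{program for } \mathcal{A}}
          + \underbrace{\lceil \log n \rceil}_{n \text{ in binary}}
    \;=\; O(\log n),
  \qquad\text{since}\qquad
  O\!\bigl(T'(n)\log T'(n)\bigr) \;=\; o\bigl(T(n)\bigr),
\]
```

so the universal machine 𝒰, running the constant-size program for 𝒜 on the binary encoding of n, prints x_n well within T(n) steps.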
Question 1: Let T: ℕ → ℕ be any time-constructible function and let ℓ(n) ∈ {n^{Ω(1)}, Ω̃(n), n − o(n)}. Is there a deterministic algorithm 𝒜 which, given input 1^n, runs in time poly(T(n)) and outputs a (𝖪^T, ℓ)-hard string?
Question 1 is a fundamental question about Kolmogorov complexity, for which we currently do not have clear positive or negative complexity-theoretic evidence. A positive answer to Question 1 would be very interesting, as it would demonstrate the power of having polynomially more time: a poly(T) time algorithm would be capable of producing highly T-incompressible strings, while a o(T/log T) time algorithm can only produce strings that are highly T-compressible. Indeed, Hypothesis 1.19 in a recent work of [27] asks a very similar question to Question 1 (the difference is that instead of only asking about producing 𝖪^T-hard strings for time-constructible functions T, [27] asks for an algorithm that gets T and n as input in unary, and outputs a 𝖪^T-hard string of length n), and states that “The plausibility of Hypothesis 1.19 remains to be investigated”. Question 1 also has implications for the theory of meta-complexity, which has seen much recent work [2, 13, 24].

A natural approach to Question 1 is via pseudo-randomness. Under the assumption that 𝖤 requires non-deterministic circuits of size 2^{ϵn} for some constant ϵ>0, it is known [15, 19] that for any constant c>0 there is a pseudo-random generator G_n with seed length O(log n) and error o(1) against co-non-deterministic circuits of size n^c, computable in time poly(n). Consider T(n) = n^d, where d<c is any constant. The property of being a 𝖪^T-incompressible string, i.e., a length-n string of 𝖪^T-complexity at least n−1, can be checked by a co-non-deterministic circuit of size at most n^c, and moreover at least half of all strings of length n satisfy the property. Hence close to half of the outputs of G_n satisfy the property of 𝖪^T-incompressibility, which means that the range of G_n is a poly(n)-size set S_n computable in time poly(n) and containing at least one 𝖪^T-incompressible string for large enough n. However, it is unclear how to identify efficiently which of the strings in S_n is incompressible - this seems to require calls to an 𝖭𝖯 oracle. We could try to combine the strings in S_n together into a string that is 𝖪^T-hard, for example by concatenating them. This approach has two drawbacks:

  1. Concatenating the strings yields a new string of length ≫ T(n) rather than length n, since the set S_n has size ≫ n^c. Hence for any reasonable setting of ℓ(n), even ℓ(n) = n^{Ω(1)}, we do not get any interesting consequences for 𝖪^T-hardness when we are interested in time bounds as a function of input length.

  2. Even if we don’t mind destroying the relation between the time bound and input length, concatenating poly(n) strings which are maximally incompressible can, at best, achieve incompressibility ℓ(n) = n^{Ω(1)}. If we are in the more stringent incompressibility settings ℓ(n) = Ω̃(n) or ℓ(n) = n − o(n), which occur more frequently in explicit construction applications, concatenating poly(n) many strings appears to be useless.
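To see the first drawback quantitatively, here is a back-of-the-envelope sketch with the parameters above, writing y for the concatenation of all strings in S_n and N for its length:

```latex
\[
  N \;=\; n\cdot|S_n| \;\ge\; n^{c+1} \;\gg\; T(n)=n^d,
  \qquad
  \mathsf{K}^{T}(y) \;\ge\; n - O(\log N) \;=\; N^{\Omega(1)},
\]
```

where the lower bound holds because a description of y yields a description of each of its length-n blocks, including the incompressible one, at an O(log N) overhead. Relative to its own length N, the guaranteed incompressibility is thus only N^{Ω(1)}, far from Ω̃(N) or N − o(N).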

Like most of the interesting problems in derandomization, answering Question 1 unconditionally would be difficult, as any (𝖪^T, n^{Ω(1)})-hard string for T > n^2 can be transformed easily into the truth table of a hard Boolean function in 𝖤, i.e., a function with circuit complexity at least 2^{ϵm} for some ϵ>0, and hence would imply 𝖤 ⊄ 𝖯/poly. Instead, we would simply like complexity-theoretic evidence in favor of or against Question 1. Preferably, this evidence should avoid exotic assumptions, and involve standard beliefs about the difficulty of simulating one resource efficiently by another, e.g., the belief that time cannot be simulated by much smaller space or by much smaller non-uniformity.

Explicit Constructions and Range Avoidance.

Recently there has been a lot of interest [20, 18, 27, 12, 14, 11, 10, 7, 8, 5, 23, 21] in the Range Avoidance problem, where the input is a circuit C from ℓ bits to n bits, where n > ℓ, and the task is to find a non-output of C. This task is efficiently doable with randomness - we can just output a random string of length n, and this will be a non-output of C with probability at least 1/2. The question is whether this can be done efficiently deterministically.
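The existence of a non-output is a pure counting fact: a map from ℓ bits to n > ℓ bits misses at least half of {0,1}^n. The following toy sketch (exponential-time brute force, for illustration only; the whole question above is whether a deterministic *polynomial-time* solver exists) makes the problem concrete:

```python
import itertools

def avoid_brute_force(C, ell, n):
    """Toy Range Avoidance solver: C maps ell-bit tuples to n-bit tuples
    with ell < n, so at most 2^ell of the 2^n strings are outputs and a
    non-output must exist. We find one by exhaustive search."""
    assert ell < n
    image = {C(seed) for seed in itertools.product((0, 1), repeat=ell)}
    for x in itertools.product((0, 1), repeat=n):
        if x not in image:
            return x

# Example "circuit" (a hypothetical stretch-2 map): duplicate the seed.
C = lambda seed: seed + seed
x = avoid_brute_force(C, 3, 6)
assert all(C(s) != x for s in itertools.product((0, 1), repeat=3))
```

The brute force takes time 2^n; the randomized solver succeeds in one sample with probability ≥ 1/2, which is exactly the gap the deterministic question asks about.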

The problem was motivated in [20, 18] by its applications to explicit constructions of combinatorial objects. Let Π be a property, such as the property of a graph being Ramsey or of a matrix being rigid, that is satisfied by a random object with probability very close to 1. For many natural such Π, including the Ramsey and Rigidity properties, it can be shown that objects not satisfying Π can be efficiently recovered from a compressed representation. This recovery process can be modeled by a small Boolean circuit C for which any non-output is the representation of an object satisfying Π. In other words, solving Range Avoidance on C enables an efficient explicit construction for Π.

Given that explicit constructions for properties such as Ramsey and Rigidity have been long sought after, this motivates the study of the complexity of Range Avoidance, and in particular the search for evidence that Range Avoidance is feasible. Unfortunately it was shown in [14] that under plausible cryptographic and complexity-theoretic assumptions, Range Avoidance is intractable.

However the intractability of Range Avoidance doesn’t give evidence for the intractability of the explicit construction problems mentioned above - it could be that a weaker assumption than solving Range Avoidance is sufficient for efficient explicit constructions. Indeed such an assumption was identified in [14] - solving Range Avoidance for uniformly generated circuits. It turns out that all of the explicit construction problems studied in [20] can be captured by Range Avoidance on uniform circuits. This now raises the question of how tractable it is to solve Range Avoidance in this special case.
Question 2: Is there compelling complexity-theoretic evidence in favour of efficient explicit constructions for properties such as Ramsey and Rigidity, and more generally in favour of the tractability of Range Avoidance on uniform circuits?
Indeed the tractability of Range Avoidance on uniform circuits is raised explicitly in [14]. It is also stated there that “some of the authors are skeptical that this special case of Avoid is easy.”

The pseudorandomness approach to Question 1 can be applied to Question 2 as well. Essentially the same argument as we used there gives that, for properties such as Ramsey and Rigid, under standard derandomization assumptions, there is an efficient construction of a list of objects at least one of which satisfies the property. However, as we do not know polynomial-time algorithms for verifying the Ramsey or Rigidity properties, it is unclear how to pick out a single object satisfying the property from such a list. Indeed the pseudo-randomness approach applies to general Range Avoidance as well, but in that case we now have evidence that the problem is not tractable.

1.2 Our Results

We give positive answers to Questions 1 and 2, under believable complexity assumptions. Our hardness assumptions are of the following two forms, each involving a parameter d:

Hypothesis (Hardness Assumptions for dth-Exponential Time Bounds).
  1. (Strong Form) There is an absolute constant ϵ>0 so that the following holds. For any time-constructible T(n), m(n) with T(n) ≥ n having at most dth-order exponential growth rate and n ≤ m(n) ≤ poly(n), there is a function f: ℕ → {0,1}^*, |f(n)| = m(n), computable in poly(T(n)) time, which cannot be computed in time T(n)^ϵ with m(n)^ϵ bits of advice by any 𝖭𝖯-oracle machine for more than finitely many n.

  2. (Weak Form) Let T(n) = exp^[d+1](n). There is some ϵ>0 and a language computable in time T(n) which is not computable in time T(n)^ϵ with an 𝖭𝖯 oracle and 2^{ϵn} bits of advice even infinitely often.

We are using exp^[d](n) for the d-fold iteration of the exponential function exp(n) = 2^n. Such assumptions, particularly of the second kind, are fairly standard in complexity theory. Indeed, when d=0, the second assumption is a standard derandomization assumption, asserting that there is a language in 𝖳𝖨𝖬𝖤[2^n] which does not have 𝖭𝖯-oracle circuits of size 2^{o(n)}. The main novelty in our case is that we use these assumptions for larger values of T, in particular for time bounds T which have iterated-exponential growth rate. While we state the strong form assumption by quantifying over all time-constructible T(n), m(n), in fact we only require the assumption to hold for quite a limited class of bounds: the bounds T(n) needed are representable by constant-length arithmetic expressions with exp, log, and ceiling/floor functions, while the bounds m(n) needed are simply of the form m(n) = exp(c·log n) for constants c.
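The iterated-exponential bounds used throughout can be pinned down concretely; a minimal sketch of exp^[d] and its inverse iteration (the function names are ours, for illustration):

```python
import math

def exp_iter(d, n):
    """exp^[d](n): the d-fold iteration of exp(n) = 2^n; exp^[0] is the identity."""
    for _ in range(d):
        n = 2 ** n
    return n

def log_iter(d, n):
    """log^[d](n): the d-fold iteration of the base-2 logarithm (real-valued)."""
    for _ in range(d):
        n = math.log2(n)
    return n

assert exp_iter(0, 5) == 5
assert exp_iter(2, 3) == 2 ** 8          # exp(exp(3)) = 2^(2^3) = 256
assert log_iter(2, exp_iter(2, 3)) == 3  # log^[2] inverts exp^[2]
```

For d=0 and d=1 these are the familiar bounds n and 2^n; already d=2 gives the doubly-exponential regime in which our assumptions are applied.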

The first assumption is analogous to assumptions made in the recent literature on hardness vs randomness [6], and generalizes the second assumption in a natural way. Both the first and second assumptions involve natural beliefs about the relationships between the fundamental resources of time, non-determinism and advice. Both assumptions formalize the intuition that time cannot be sped up by an arbitrary polynomial amount using non-determinism and advice. Indeed, note that there is no known way to speed up time even by a super-constant amount by using non-determinism and advice.

We believe that our assumptions hold relative to a random oracle, and here is an informal argument for the case of the second assumption. Given oracle A, define the time-T(n)-bounded Turing machine M^A to accept x of length n if the x-th string of length T(n) is in A (where we consider the lexicographic order on strings). For a random oracle A, the truth table of A at length T(n) is Kolmogorov-random even conditioned on the truth tables below length T(n). And T(n)^ϵ-time non-deterministic machines can only access strings of A of length T(n)^ϵ or below, so it should not be possible to compute the first 2^n bits of A at length T(n) from this information together with arbitrary T(n)^ϵ bits of advice.

Our approach to using these assumptions in service of constructing random strings is via pseudo-randomness, and is a variation on the strategy we critiqued a few paragraphs ago: construct a pseudorandom generator, enumerate its range, and concatenate the constituent strings. The problems with this approach discussed in the last section boiled down to one issue: for a standard complexity-theoretic PRG fooling polynomial-size nondeterministic circuits, we can at best achieve a seed length of O(log n), and hence will end up needing to concatenate a very long list of poly(n) strings. To remedy the situation, it would suffice to use pseudorandom generators with much smaller seed length of o(log n) or even log log n. While it is known that O(log n) seed length is information-theoretically optimal for fooling nonuniform circuits of size poly(n), we aim to exploit the uniformity of the adversary we are fooling to get seed length as small as possible; for uniform adversaries the lower bound of log n no longer applies. This yields a result which is interesting in its own right: efficient constructions of small pseudo-random sets for uniform (or even slightly non-uniform) algorithms, under assumptions of the form discussed in the previous paragraph.

Theorem 1 (Informal, see Theorems 16 and 18).

Under the Hypotheses above with d=1, there is a pseudorandom generator with seed length O(log log n) which fools 𝖳𝖨𝖬𝖤^𝖭𝖯[poly(n)]; if we assume the strong form, the generator runs in polynomial time, and if we assume the weak form, it runs in quasipolynomial time.

More generally, under the above hardness assumptions with larger parameter d (i.e. for higher-exponential time bounds), we obtain pseudorandom generators fooling 𝖳𝖨𝖬𝖤^𝖭𝖯[poly(n)] with seed length O(log^[d+1](n)) (for infinitely many n). If we assume the strong form hypothesis our generators will run in polynomial time, and if we assume the weak form they will run in iterated quasipolynomial time. When d>1, our construction will require in addition that the input length n is of the form n = exp^[d−1](n′) for some n′, or more generally that 𝖪^{poly(n)}(n) ≤ O(log^[d](n)).

Some terms in the above require clarification. We are using log^[d](·) to denote the d-fold iteration of the logarithm. Our notion of “iterated quasipolynomial time” will be defined in the preliminaries - it describes a family of runtime bounds which grow faster than polynomial and quasipolynomial, but are much closer to polynomial than to exponential. Finally, the notation 𝖪^{poly(n)}(n) denotes the time-bounded Kolmogorov complexity of the number n, when written as a string in standard binary notation.

Our PRG construction can actually handle advice (nonuniformity) up to log^[d](n) when the seed length is log^[d+1](n), which we observe is optimal via a standard argument (Lemma 22). It also holds for fooling larger time bounds 𝖳𝖨𝖬𝖤^𝖭𝖯[T(n)] for T(n) ≫ poly(n) (at the cost of the time complexity of the generator being low as a function of T(n) rather than of n), and for other oracles 𝒪 in place of the 𝖭𝖯 oracle (provided we use 𝒪 in place of 𝖭𝖯 in the hardness assumption).

We then turn our attention briefly to PRGs fooling uniform algorithms without 𝖭𝖯 oracles: what is the minimal seed length such that we can fool 𝖳𝖨𝖬𝖤[T(n)] by a PRG with running time T(n)^{O(1)}? For this problem we show in Theorem 20 that the machinery developed above is unnecessary: if the PRG is capable of simulating the distinguishers it is trying to fool, there is a straightforward method which can reduce seed length O(log n) to arbitrarily small seed length. Hence under the standard assumptions that achieve logarithmic seed length for 𝖳𝖨𝖬𝖤[T(n)]/n (namely that 𝖤 requires exponential circuit size) we can obtain an arbitrarily strong reduction in seed length for 𝖳𝖨𝖬𝖤[T(n)]. Like our 𝖭𝖯-oracle result, this result applies to distinguishers with mild nonuniformity, and the achievable seed length scales with the uniformity in a way that is provably optimal according to Lemma 22.

Random Strings and Explicit Constructions.

We next use our short-seed pseudorandom generators for 𝖯^𝖭𝖯 (Theorem 1) to give (conditional) constructions of 𝖪^{poly}-random strings. Our first construction of random strings is the following, which achieves incompressibility Ω̃(n):

Theorem 2 (Informal, see Theorems 23 and 24).

Under our main hardness assumption with d=1, for any k there is an algorithm 𝒜 such that 𝒜(1^n) outputs an n-bit string x satisfying 𝖪^{n^k}(x) ≥ n − polylog(n) for all n; the algorithm will run in polynomial time if we use the strong form of the assumption, and quasipolynomial time if we use the weak form.

More generally, under our main hardness assumption for larger values of d, we obtain constructions of strings with 𝖪^{n^k}(x) ≥ n − poly(log^[d](n)). Using the strong form of the assumption the construction runs in polynomial time, and under the weak form it runs in iterated quasipolynomial time. When d>1 we require the integer n to be of the form n = exp^[d−1](n′) for some integer n′, or more generally to satisfy 𝖪^{poly(n)}(n) ≤ O(log^[d] n).

If we are satisfied with our construction algorithms succeeding on some unknown infinite set of input lengths, we are able to bootstrap the above constructions so that the log^[d](n) loss in 𝖪-complexity becomes additive rather than multiplicative:

Theorem 3 (Informal, see Theorem 25).

Assume the first version of our main hardness assumption for all d. Then for any k, d there is a polynomial-time algorithm 𝒜 such that 𝒜(1^n) outputs an n-bit string x satisfying 𝖪^{n^k}(x) ≥ n − log^[d] n for infinitely many n.

In Section 4.2 we then use the previous results to give conditional polynomial time explicit constructions of various important combinatorial objects shown reducible to Range Avoidance in [20]. We will defer most of the specifics to Section 4.2, but highlight the following two results:

Theorem (See Theorems 28 and 29).
  1. Under the strong form of our main hardness assumption with d=2, there is a poly(n)-time algorithm which produces a matrix M ∈ 𝔽_2^{n×n} which is Valiant-rigid whenever n is a power of 2.

  2. Assuming that there is a language in 𝖳𝖨𝖬𝖤[2^{2^n}] that is not computable in 𝖳𝖨𝖬𝖤^𝖭𝖯[2^{ϵ·2^n}]/2^{ϵn} (even infinitely often; this is a particular case of the weak form hardness assumption), there is a language in 𝖤𝖷𝖯 that requires Boolean circuits of size 2^n/poly(n) for all n.

Crucially, our results give the first standard-form hardness assumptions under which the above explicit construction problems (and others in Section 4.2) have an efficient algorithm, and they do so via a universal approach: we have a single object, 𝖪poly-random strings, whose efficient explicit construction follows from plausible hardness assumptions, and which automatically satisfies various other pseudorandom properties, e.g. rigidity as a matrix or maximal hardness as a Boolean function.

Further Applications.

In Sections 4.3 and 4.4 we discuss two further applications of our random string constructions. The first shows a “hardness condensation” phenomenon for the main hardness assumptions we use in this paper (with the lower bound relativized to a 𝖯𝖲𝖯𝖠𝖢𝖤 oracle), whereby we may amplify the degree of nonuniformity in the lower bounds from 2^{ϵn} to 2^{n−o(n)}. The second addresses the question of the relative difficulty of uniform range avoidance and general (nonuniform) range avoidance posed in [27, 14], where we observe using our main results and some previous hardness results ([14, 1]) that under plausible hardness assumptions, nonuniform range avoidance is significantly harder than the uniform variant.

Barriers to Improving Our Main Results.

In the final section we give two sets of results indicating barriers to improving our main PRGs and random string constructions. Our first result casts our iterated-logarithmic seed-length PRG fooling 𝖯^𝖭𝖯 (Theorem 1) as a reconstructive multi-source extractor, similar to Trevisan’s analysis [29] of the classical hardness-randomness connection in [26, 15] in terms of reconstructive single-source extractors. We show that any such extractor must have iterated-logarithmic seed length, and thus that any improvement to the seed length of our PRG in Theorem 1 must use substantially different techniques.

Second, we give an oracle separation showing that it is not possible via black-box arguments to obtain our main results from more standard assumptions like 𝖤 ⊄ 𝖲𝖨𝖹𝖤[2^{ϵn}]. In [20] it is shown that explicit construction of (𝖪^{poly}, n−1)-random strings and of 2^{ϵn}-hard truth tables are equivalent with respect to 𝖭𝖯-oracle reductions. If such an equivalence held without 𝖭𝖯 oracles it would supersede our main results here: we could start with an assumption that 𝖤 requires 2^{ϵn}-size circuits, and then use the reduction to produce from the hard truth table a string with high time-bounded Kolmogorov complexity. We show that no such relativizing reduction exists. This justifies in some sense the need for more high-end hardness assumptions to achieve our main results.

1.3 Overview of Main Construction

We describe at a high level our construction of a pseudorandom generator with small seed length secure against any L ⊆ {0,1}^n computable in 𝖳𝖨𝖬𝖤^𝖭𝖯[poly(n)]. Using the classical hardness-randomness connection, under the assumption that 𝖳𝖨𝖬𝖤[2^{O(n)}] requires exponential-size 𝖭𝖯-oracle circuits we can obtain a generator G_0: {0,1}^{s_0(n)} → {0,1}^n with seed length s_0(n) = O(log n). The key observation is that this first attempt at a generator significantly overshoots our primary goal in one respect: the generator G_0 obtained from the generic hardness-randomness connection would in fact be secure against 𝖳𝖨𝖬𝖤^𝖭𝖯[poly(n)]/n machines, which have access to n bits of advice. On the other hand we only want a generator which fools uniform algorithms.

To use this to our advantage, we consider a recursive argument. Let n_1 = s_0(n), and consider n_1 as our new input length. Define the language L_1 ⊆ {0,1}^{n_1} consisting of the seeds z of G_0 such that G_0(z) ∈ L. We know that Pr_{z∼{0,1}^{n_1}}[z ∈ L_1] ≈ Pr_{x∼{0,1}^n}[x ∈ L] by the security of our first generator G_0, and therefore if we could find a second generator G_1: {0,1}^{s_1(n_1)} → {0,1}^{n_1} which fools the language L_1, we would find that the composition G_0 ∘ G_1 fools the original language L. If s_1(n_1) = O(log n_1) = O(log log n) we would have made significant progress. For such a generator G_1 to fool L_1, we would need it to fool 𝖭𝖯-oracle algorithms with running time poly(n); however, the input length of L_1 is n_1 = O(log n), so as a function of the new input length we need G_1 to fool algorithms running in exponential time with an 𝖭𝖯 oracle. We may proportionally increase the allowable runtime of G_1 to exponential as well; so overall we have scaled the time complexities of our generator/distinguisher by an exponential. Crucially, however, the advice complexity of deciding L_1 remains O(1): this is where we use the uniformity of our distinguisher. If L required n bits of advice to compute, then the best advice upper bound we could place on L_1 would be n ≈ 2^{n_1}, which is trivial.

Continuing in this way, we are able to iteratively reduce the seed length of our original generator by a logarithmic composition, for an arbitrary constant number of phases. The cost is that we need to obtain, in each step, pseudorandom generators with logarithmic seed length that fool algorithms running in exp(exp(⋯exp(n))) time with an 𝖭𝖯 oracle, and which are computable in a comparable amount of time (without an 𝖭𝖯 oracle). To achieve such generators we rely on the standard toolkit of hardness-randomness transformations [26, 15, 30]. In this way, we can argue the security of our construction based on the kind of hardness assumptions described above.
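The seed-length arithmetic of one composition step can be sketched concretely. In the toy below, random lookup tables stand in for the hardness-based generators G_0 and G_1 (the real construction derives them from the hardness assumptions and stores no tables, and of course no security claim is made for random tables):

```python
import random

def compose(G_outer, G_inner):
    """If G_inner fools the 'inner' language L_1 = {z : G_outer(z) in L},
    then the composition fools L while using only the inner, shorter seed."""
    return lambda w: G_outer(G_inner(w))

rng = random.Random(0)
n, s0, s1 = 64, 6, 3  # s0 ~ log n, s1 ~ log s0 ~ log log n

# G0: s0-bit seeds (as ints) -> n-bit outputs; G1: s1-bit seeds -> s0-bit seeds.
table0 = {z: tuple(rng.randrange(2) for _ in range(n)) for z in range(2 ** s0)}
table1 = {w: rng.randrange(2 ** s0) for w in range(2 ** s1)}
G0, G1 = table0.__getitem__, table1.__getitem__

G = compose(G0, G1)
outputs = {G(w) for w in range(2 ** s1)}
assert len(G(0)) == n and len(outputs) <= 2 ** s1  # range shrinks from 2^s0 to 2^s1
```

One step cuts the seed from s_0 ≈ log n to s_1 ≈ log log n; iterating d times reaches the log^[d+1](n) seed lengths of Theorem 1, at the cost of needing generators secure against iterated-exponential-time adversaries.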

There are a few additional intricacies that we are skimming over here, which contribute to the two caveats in our main theorem statements: the weaker runtime bounds when using the second version of our hardness assumptions, and the requirement that the number n be of the form n = exp^[d−1](n′) for some n′ when using the first version of our hardness assumptions. The first of these issues roughly stems from the fact that, when trying to apply our hardness assumption to get an O(log n_j) seed-length generator on input length n_j ≈ log^[j](n) (which we will then compose with the outer generator to reduce the total seed length by another logarithm), we will actually need to apply our hardness assumption on some other input length m_j = O(n_j); this will mean that we can at best hope for our generator to run in time poly(exp^[j](O(n_j))) ≈ exp^[j](O(log^[j] n)), which will be superpolynomial when j>1. This issue can be avoided by considering the more refined first version of our hardness hypothesis.

For the second issue, it turns out that we can only guarantee the security of our generator on input lengths n such that the number n itself has time-bounded Kolmogorov complexity O(log^[d](n)). For d=1 this changes nothing, since every number n has 𝖪^{O(n)}(n) ≤ ⌈log n⌉ via its binary representation, so we achieve a generator with seed length O(log log n) which is valid on all input lengths; but for large values of d our construction is only valid for infinitely many n (in particular it will work for all n of the form exp^[d−1](n′) for some n′). The reason for this issue is as follows: in the above exposition, we considered the language L_1 ⊆ {0,1}^{n_1} of seeds mapping to strings in the base language L, and argued it is computable in time exponential in its input length. When we continue this argument for more steps and obtain languages L_2, L_3, …, we would like to argue at each step that they are uniformly computable with a small amount of advice (comparable to the input length they are defined on). However, the definition of these languages depends on the original input length n that we started on, and since n ↦ ⌊log^[d](n)⌋ is far from injective for any d>0, we cannot determine n solely from the smaller input lengths, but will need to somehow supply the number n as advice. If we assume the number n is sufficiently compressible, we are able to skirt this issue and complete the argument.

1.4 Related Work

We have already mentioned connections of our work to meta-complexity and Range Avoidance. In terms of derandomizing uniform algorithms, there has been a lot of work on uniform hardness-randomness tradeoffs, beginning with [16], but much of that work is not relevant to our setting as the emphasis is on uniform assumptions for derandomization, while we are interested instead in decreasing the seed length, and are not concerned with uniformity of the assumption.

However, the recent work of [6] on super-fast derandomization is indeed very relevant from a technical point of view. They give a (conditional) derandomization of 𝖡𝖯𝖳𝖨𝖬𝖤[T(n)] into 𝖳𝖨𝖬𝖤[T(n)·n^{1+ϵ}]. The classical approach to derandomizing 𝖡𝖯𝖳𝖨𝖬𝖤[T(n)] would be to model it as a circuit of size T(n) with input length T(n) (the input represents the randomness) and use a PRG fooling this class. To fool this class of circuits requires a PRG with seed length log T(n) and running time T(n), and if T(n) = n^3 we could therefore not hope to achieve derandomization in time T(n)·n^{1+ϵ} no matter how fast the PRG's runtime is. A key observation in [6] is that this approach to derandomizing 𝖡𝖯𝖳𝖨𝖬𝖤[T(n)] actually overshoots what is required: if we model the 𝖡𝖯𝖳𝖨𝖬𝖤[T(n)] machine more precisely as a 𝖳𝖨𝖬𝖤[T(n)] machine on input length T(n) with nonuniformity n ≪ T(n), then we only need a PRG which fools 𝖳𝖨𝖬𝖤[T(n)]/n; for this task, it is possible (at least information-theoretically) to achieve a seed length of log n, or more generally (1+ϵ)log n, rather than log T(n), which means that the enumeration of seeds will only cost us n^{1+ϵ} time.

They then proceed to construct a PRG for 𝖳𝖨𝖬𝖤[T(n)]/n with seed length (1+ϵ)log n, computable in time T(n)^{1+ϵ}. In their case a great deal of work must be dedicated to optimizing the runtime of the PRG, which has no relevant parallel in our work; however, the question of reducing the seed length down to the true level of uniformity, and the assumptions under which they are able to achieve it, have similarities to ours. In particular, they require assumptions of the form 𝖳𝖨𝖬𝖤[T(n)] ⊄ 𝖳𝖨𝖬𝖤[T(n)^{1−δ}]/2^{(1−δ)n} where T(n) = 2^{kn} for a large constant k.

The innovation in our work is the use of recursion to achieve dramatically smaller seed length against inefficient uniform adversaries using a small amount of non-uniformity, which enables us to address Questions 1 and 2.

2 Preliminaries

We start with some notation for the classes of growth rates appearing in this work:

Definition 4 (Growth Rates).

Both log(·) and exp(·) are base 2. For any function f: ℕ → ℕ and d ∈ ℕ we use f^[d] to denote the d-fold composition of f with itself, with f^[0] being the identity n ↦ n. We say that a function f has elementary growth rate, or that it is “elementary,” if f(n) ≤ exp^[d](n) for some absolute constant d. We say that a function f: ℕ → ℕ is time-constructible if there is a Turing machine M such that M(1^n) prints f(n) in binary, and M has running time O(f(n)) on all inputs.

For each d ∈ ℕ and α ∈ ℝ^+ we define the function Φ_{d,α}(n) = exp^[d](log(α·log^[d−1](n))). Using this function we define a hierarchy of O(·) notations more relaxed than the standard one, as follows: for f, g: ℕ → ℕ, we say f(n) = O_d(g(n)) if f(n) ≤ Φ_{d,C}(g(n)) for some C and sufficiently large n, and f(n) = Ω_d(g(n)) if f(n) ≥ Φ_{d,ϵ}(g(n)) for some ϵ>0 and sufficiently large n. o_d(g(n)) and ω_d(g(n)) are defined analogously.

In this notation we have O_1(n) = O(n), O_2(n) = poly(n), and O_3(n) = quasipoly(n). This hierarchy of growth rates enumerates a class of strongly subexponential growth rates having magnitude much closer to n than to 2^n: indeed, for each fixed d, O_d(n) is closed under composition and grows more slowly than Ω_{d′}(2^n) for any fixed d′. Similarly, Ω_1(n) = Ω(n), Ω_2(n) = n^{Ω(1)}, Ω_3(n) = 2^{log^{Ω(1)} n}, and for each fixed d the class Ω_d(n) has a growth rate much closer to n than to log n, and in particular far exceeds the growth rate of O_{d′}(log n) for any fixed d′.
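The collapse of Φ_{d,α} to the familiar classes (αn for d=1, n^α for d=2) can be checked numerically; a floating-point sketch with base-2 logs (function names are ours):

```python
import math

def exp_iter(d, x):
    """d-fold iteration of exp(x) = 2^x (real-valued)."""
    for _ in range(d):
        x = 2.0 ** x
    return x

def log_iter(d, x):
    """d-fold iteration of the base-2 logarithm (real-valued)."""
    for _ in range(d):
        x = math.log2(x)
    return x

def phi(d, alpha, n):
    """Phi_{d,alpha}(n) = exp^[d](log(alpha * log^[d-1](n)))."""
    return exp_iter(d, math.log2(alpha * log_iter(d - 1, n)))

n = 2.0 ** 20
assert abs(phi(1, 3, n) - 3 * n) < 1e-6 * n        # Phi_{1,alpha}(n) = alpha*n
assert abs(phi(2, 3, n) - n ** 3) < 1e-6 * n ** 3  # Phi_{2,alpha}(n) = n^alpha
```

The identity log(α·log^[d−1](n)) = log α + log^[d](n) is what makes the hierarchy tick: Φ_{d,α} perturbs n only at the d-th iterated-log level, so each O_d(n) is closed under composition.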

Definition 5 (Languages and Complexity Classes).

For functions f, g: ℕ → ℕ and a language A ⊆ {0,1}^* we define the complexity class 𝖳𝖨𝖬𝖤^A[f(n)]/g(n) consisting of those languages decidable in time O(f(n)) with an oracle for the language A and using O(g(n)) bits of advice on inputs of length n. When g(n) = 0 we omit this argument.

For a language L ⊆ {0,1}^* we use L_n: {0,1}^n → {0,1} to denote the restriction of L to {0,1}^n, and 𝖲𝖨𝖹𝖤[f(n)] to denote the set of languages L such that L_n is computed by Boolean circuits of size O(f(n)) for all n.

We next define our notation for time-bounded Kolmogorov complexity:

Definition 6 (Kolmogorov Complexity).

We fix an efficient universal oracle Turing machine 𝒰 once and for all. For an oracle language 𝒪 ⊆ {0,1}^*, time bound T, and x, y ∈ {0,1}^*, we use 𝖪^{T,𝒪}(x∣y) to denote the length of the shortest program π ∈ {0,1}^* so that 𝒰^𝒪(π, y) halts with output x in at most T time steps. If 𝒪 = ∅ or y is the empty string we omit them from the notation, as in 𝖪^T(x∣y), 𝖪^{T,𝒪}(x) respectively.

For a number n, we use 𝖪T,𝒪(n) to denote 𝖪T(bin(n)) where bin(n){0,1}logn is the canonical binary representation of n.

Finally we introduce relevant notation for pseudorandom generators:

Definition 7 (Pseudorandom Generators).

For a family of “distinguishers” 𝒟 ⊆ {D: {0,1}^n → {0,1}} and ϵ ∈ [0,1], we say that G: {0,1}^s → {0,1}^n is a pseudorandom generator (PRG) secure against 𝒟 with error ϵ if, for all D ∈ 𝒟, we have

|Pr_{x∈{0,1}^n}[D(x)=1] − Pr_{z∈{0,1}^s}[D(G(z))=1]| ≤ ϵ

We say that G is a hitting set generator (HSG) if it satisfies the weaker condition

(Pr_{x∈{0,1}^n}[D(x)=1] > ϵ) ⟹ (Pr_{z∈{0,1}^s}[D(G(z))=1] > 0)

Let s: ℕ → ℕ, ϵ: ℕ → [0,1]. If 𝒞 is a complexity class (set of languages over {0,1}) and G = (G_n: {0,1}^{s(n)} → {0,1}^n)_{n∈ℕ} is an ensemble of generators, we say that G is a PRG (resp. HSG) secure against 𝒞 with error ϵ if, for all L ∈ 𝒞, there is n_0 so that for all n > n_0, G_n is a PRG (resp. HSG) secure against {L_n} with error ϵ(n).
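For very small parameters, Definition 7 can be checked by exhaustive enumeration. The sketch below (Python; the toy generator and distinguishers are our own illustrative choices, not objects from the paper) verifies the PRG and HSG conditions directly:

```python
from itertools import product

def is_prg(G, s, n, distinguishers, eps):
    # exhaustive check: |Pr_x[D(x)=1] - Pr_z[D(G(z))=1]| <= eps for all D
    for D in distinguishers:
        p_real = sum(D(x) for x in product((0, 1), repeat=n)) / 2 ** n
        p_fake = sum(D(G(z)) for z in product((0, 1), repeat=s)) / 2 ** s
        if abs(p_real - p_fake) > eps:
            return False
    return True

def is_hsg(G, s, n, distinguishers, eps):
    # weaker hitting condition: if D accepts > eps fraction of strings,
    # then D must accept some output of G
    for D in distinguishers:
        p_real = sum(D(x) for x in product((0, 1), repeat=n)) / 2 ** n
        if p_real > eps and not any(D(G(z)) for z in product((0, 1), repeat=s)):
            return False
    return True

# toy generator: append a parity bit to the 2-bit seed
G = lambda z: z + (z[0] ^ z[1],)
print(is_prg(G, 2, 3, [lambda x: x[0]], 0.1))                # True: first bit is unbiased
print(is_prg(G, 2, 3, [lambda x: x[0] ^ x[1] ^ x[2]], 0.1))  # False: parity distinguishes
```

Every PRG is an HSG with the same error, but not conversely; here the parity distinguisher also witnesses failure of the hitting condition, since no output of G has odd parity.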

2.1 Range Avoidance and Construction of Random Strings

We formalize a general notion of explicit construction problems as follows:

Definition 8 (Explicit Construction Problems).

An explicit construction problem is defined by a language Π ⊆ {0,1}^*, such that Π_n := Π ∩ {0,1}^n ≠ ∅ for all n. The computational task is: given 1^n as input, output a string x ∈ Π_n.

If Π is an explicit construction problem, we say that a function S mapping each 1^n to a finite list of strings is a “list solution” to Π if S(1^n) ∩ Π_n ≠ ∅ for all n. The “list size” ℓ(·) is defined as ℓ(n) = |S(1^n)|.

As discussed in the introduction, an important class of explicit construction problems are those reducible to the problem range avoidance:

Definition 9 (Range Avoidance [18, 20, 27]).

Range avoidance, or “Avoid,” is the following search problem: given a Boolean circuit C: {0,1}^m → {0,1}^n with m < n, output a string x ∈ {0,1}^n ∖ range(C). We say that this Avoid instance has “stretch” m → n.

We say that an explicit construction problem Π reduces to Avoid in polynomial time with stretch function ℓ(n), if there is a polynomial time algorithm which, given 1^n, outputs an Avoid instance C_n: {0,1}^{ℓ(n)} → {0,1}^n so that whenever x is a solution for C_n, we have x ∈ Π.
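The problem is trivially solvable in exponential time, as the following sketch makes concrete (Python; the circuit is modeled as an arbitrary function, a stand-in for a Boolean circuit). The point of the results below is that, under hardness assumptions, uniform instances can be solved efficiently:

```python
from itertools import product

def solve_avoid(C, m, n):
    # C maps {0,1}^m -> {0,1}^n with m < n, so at least 2^n - 2^m strings
    # lie outside range(C); find one by brute-force enumeration.
    assert m < n
    image = {C(z) for z in product((0, 1), repeat=m)}
    for x in product((0, 1), repeat=n):
        if x not in image:
            return x

# stretch 2 -> 3: the circuit pads its input with a 0,
# so any string ending in 1 is a valid solution
print(solve_avoid(lambda z: z + (0,), 2, 3))  # (0, 0, 1)
```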

The key connection between 𝖪poly random strings and explicit construction problems reducible to Range Avoidance is the following:

Observation 10 ([27]).

Say that for every k, there is a polynomial time algorithm which, for every n (resp. infinitely many n), outputs a string x ∈ {0,1}^n with 𝖪^{n^k}(x) > ℓ(n). Then every explicit construction problem reducible to Avoid with stretch function ℓ(n) is solvable in polynomial time (resp. for infinitely many n).

Proof.

The definition of the reduction implies that there is a polynomial time algorithm producing circuits C_n: {0,1}^{ℓ(n)} → {0,1}^n, so that whenever x ∉ range(C_n) we have x ∈ Π. If k is such that the algorithm constructing C_n runs in time n^k, then we observe that every string x in the range of C_n satisfies 𝖪^{n^k}(x) ≤ ℓ(n) (up to an additive constant describing the uniform algorithm, since x is determined by its ℓ(n)-bit seed); hence any string of higher complexity avoids the range of C_n.

Recall that our approach to constructing 𝖪^poly random strings will consist of two steps. First, we will try only to construct a short list of strings, one of which is guaranteed to have 𝖪^poly complexity at least n−1; we then concatenate the list to obtain a single string whose complexity degrades by a factor proportional to the length of the list. The second step (concatenation) is justified by the following standard claim:

Observation 11.

Let x_1, …, x_m ∈ {0,1}^n and let x̂ = (x_1, …, x_m) be their concatenation. Then for any time bound T and j ≤ m we have 𝖪^{T−O(mn)}(x̂) ≥ 𝖪^T(x_j) − O(log m).
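The observation rests on a simple decoding step: any block of the concatenation is recoverable from x̂ plus an O(log m)-bit index, so a short description of x̂ would yield a short description of each x_j. A minimal sketch:

```python
def concatenate(blocks):
    # form x_hat = (x_1, ..., x_m); each block is an n-bit string
    return "".join(blocks)

def recover(x_hat, n, j):
    # recover x_j from x_hat and the index j (an O(log m)-bit description),
    # in O(mn) extra time; hence any T-time description of x_hat gives a
    # description of x_j that is only O(log m) bits longer
    return x_hat[j * n : (j + 1) * n]

blocks = ["0101", "1111", "0010"]
print(recover(concatenate(blocks), 4, 2))  # 0010
```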

For the first step (obtaining a short list of candidate solutions), we rely on the following observation:

Observation 12.

Let (G_n: {0,1}^{s(n)} → {0,1}^n)_{n∈ℕ} be a hitting set generator secure against 𝖳𝖨𝖬𝖤^𝖭𝖯[O(T(n))] with error 1/2. Then there exists a seed z ∈ {0,1}^{s(n)} so that 𝖪^{T(n)}(G_n(z)) ≥ n−1; in other words the range of G_n is a list solution for the set of strings with 𝖪^{T(n)}(·) ≥ n−1.

Proof.

At least half of all n-bit strings have 𝖪^{T(n)}(·) ≥ n−1, by counting descriptions. On the other hand 𝖪^{T(n)}(·) is computable in time O(T(n)) with an 𝖭𝖯 oracle, so the set of strings of complexity at least n−1 is accepted by a distinguisher in this class which accepts at least half of all strings; a hitting set generator against the class must therefore have some output of complexity at least n−1.
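The counting step is elementary: there are fewer than 2^{n−1} programs of length less than n−1, so at most that many n-bit strings can have complexity below n−1. A one-line check (Python):

```python
def short_programs(n):
    # number of binary programs of length < n-1 (lengths 0, 1, ..., n-2)
    return sum(2 ** i for i in range(n - 1))

n = 20
assert short_programs(n) == 2 ** (n - 1) - 1
# so at least 2^n - (2^(n-1) - 1) > 2^(n-1) strings of length n have K^T >= n-1
print(short_programs(n))  # 524287
```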

Hence, we have reduced the problem of constructing random strings to constructing hitting set generators against 𝖯^𝖭𝖯 with short seed length. The next section is dedicated to the construction of such a generator under plausible hardness assumptions. We will in fact produce a pseudorandom generator (despite only needing a hitting set generator).

3 PRGs for Uniform Classes with Near-Optimal Seed Length

The main goal in this section is to produce a pseudorandom generator computable in poly(n) time, or more generally O_d(n) time for some fixed constant d, which is secure against uniform 𝖯^𝖭𝖯 algorithms and has seed length significantly smaller than log n. We will phrase all of our assumptions and generator constructions with respect to an arbitrary oracle 𝒪 in place of 𝖭𝖯 for the sake of generality.

We will consider here two qualitatively distinct classes of hardness assumption used to instantiate our generators, each parameterized by an oracle 𝒪 (typically we set 𝒪=𝖭𝖯) and a constant d:

Hypothesis 1 (Strong Assumption for 𝒪,d, abbreviated 𝖲𝖧(𝒪,d)).

There is ϵ > 0 so that for all time constructible T(n) ≤ (exp^[d](n))^{O(1)} and all time constructible m: ℕ → ℕ with n ≤ m(n) ≤ poly(n) the following holds: there is a sequence of strings (f_n ∈ {0,1}^{m(n)})_{n∈ℕ} so that the map n ↦ f_n is computable uniformly in T(n) time, but no machine running in time T(n)^ϵ with an 𝒪 oracle and m(n)^ϵ bits of advice can compute f_n for more than finitely many n. In the case T(n) ≤ poly(n), we only require the assumption to hold in the case m(n) = n.

We refer to m(n) as the “length bound” in the above.

Hypothesis 2 (Weak Assumption for parameters 𝒪, d, v with d ≥ 1, abbreviated 𝖶𝖧(𝒪,d,v)).

Let d ≥ 1 and T(n) = exp^[d](n). There is some constant ϵ > 0 and a language L computable in 𝖳𝖨𝖬𝖤[T(n)] which is not computable in 𝖳𝖨𝖬𝖤^𝒪[Φ_{v,ϵ}(T(n))]/Φ_{v,ϵ}(2^n) on more than finitely many input lengths.

Note that 𝖶𝖧(𝒪,1,2) translates to the assumption that 𝖤 requires 2^{Ω(n)} circuit complexity with 𝒪 oracles, which is the standard regime in which polynomial time generators fooling 𝖯^𝒪 with logarithmic seed length can be constructed by known methods.

3.1 Fooling 𝗧𝗜𝗠𝗘𝗡𝗣[𝑻(𝒏)] with 𝑶(𝐥𝐨𝐠𝒏) Seed Length

The first step in our construction is to use classical hardness-randomness constructions to give pseudorandom generators which fool 𝖳𝖨𝖬𝖤𝖭𝖯[T(n)] with O(logn) seed length and runtime T(n) in the case T(n) is very large, under the appropriate hardness assumptions. Depending on which assumption we use we get a generator with different parameters:

Lemma 13.

Assume 𝖲𝖧(𝒪,d). Then for every time constructible T(n) ≤ (exp^[d](n))^{O(1)} there is a PRG (G_n: {0,1}^{s(n)} → {0,1}^n)_{n∈ℕ} with seed length s(n) = O(log n), computable uniformly in time poly(T(n)), which fools 𝖳𝖨𝖬𝖤^𝒪[T(n)]/3n.

Lemma 14.

Assume 𝖶𝖧(𝒪,d+1,v) with d ≥ 0, v ≥ 2. Then for any k there is a PRG (G_n: {0,1}^{s(n)} → {0,1}^n)_{n∈ℕ} computable uniformly in time O_{d+v}(n) which fools 𝖳𝖨𝖬𝖤^𝒪[n^k]/3n and has seed length s(n) ≤ O_{v−1}(log n).

We will prove the first case (Lemma 13) here, and reserve Lemma 14 for the Appendix as its proof is similar. We require the following standard tool from complexity-theoretic pseudorandomness:

Theorem 15 (Black-Box Hardness-Randomness Connection [15]).

For each ϵ > 0 there exists a constant c ≥ 3/ϵ and a uniform algorithm 𝖨𝖶 with the following behavior. On input n and given oracle access to a function f: {0,1}^{ℓ(n)} → {0,1} with ℓ(n) = c·log n, 𝖨𝖶^f computes a function 𝖨𝖶^f: {0,1}^{s(n)} → {0,1}^n in poly(n) time for s(n) ≤ O(ℓ(n)) such that for any D: {0,1}^n → {0,1} with

|Pr_{x∈{0,1}^n}[D(x)=1] − Pr_{z∈{0,1}^{s(n)}}[D(𝖨𝖶^f(z))=1]| ≥ 1/n

there exists a circuit C with D oracle gates computing f whose total size is at most 2^{ϵ·ℓ(n)}.

Proof of Lemma 13.

We will invoke 𝖲𝖧(𝒪,d) on some yet to be determined time bound T(n) and length bound m(n), with respect to oracle 𝒪 and constant d; let ϵ>0 be the implied constant guaranteed by the hypothesis (which does not depend on the choice of T, m).

Next, invoke Theorem 15 with parameter δ := ϵ/2, and let c be the guaranteed constant from this Theorem. So there is a function 𝖨𝖶^f: {0,1}^{s(n)} → {0,1}^n which, given oracle access to a function f: {0,1}^{ℓ(n)} → {0,1} with ℓ(n) = c·log n, runs in poly(n) time with poly(n) oracle calls to f, and such that whenever D: {0,1}^n → {0,1} distinguishes 𝖨𝖶^f from uniform with advantage 1/n, there is a D-oracle circuit C of size 2^{δ·ℓ(n)} computing f.

Now we set m(n) = 2^{ℓ(n)} = 2^{c·log n} and let T′(n) = (m(n)·T(n))^{3/ϵ}, which are both time constructible, and use these in our specific invocation of 𝖲𝖧(𝒪,d) as hinted previously. We then obtain an algorithm which, in time T′(n), computes a string f_n ∈ {0,1}^{m(n)}, which cannot be computed by any machine running in time T′(n)^ϵ = (m(n)·T(n))^3 with an 𝒪 oracle and m(n)^ϵ bits of advice. We then claim that setting G_n = 𝖨𝖶^{f_n}: {0,1}^{s(n)} → {0,1}^n gives the required generator. By assumption it is computable in time T′(n)·poly(n) ≤ poly(T(n)). On the other hand, if it were distinguished in time T(n) with 3n bits of advice, we would obtain a circuit computing every bit of f_n of size 3n + 2^{δ·ℓ(n)} = 3n + m(n)^{ϵ/2} with oracle gates that can be evaluated in time T(n); hence overall we would be able to compute f_n in time O((3n + m(n)^{ϵ/2})·m(n)·T(n)) < (m(n)·T(n))^3 = T′(n)^ϵ with 3n + m(n)^{ϵ/2} < m(n)^ϵ bits of advice, provided n = o(m(n)^ϵ). Recalling from our application of Theorem 15 that c·δ ≥ 3, we have that m(n)^{ϵ/2} = n^{c·ϵ/2} ≥ Ω(n^3) and we are done.

3.2 Recursive Generator Construction

We give here our main construction of a generator fooling languages decidable in polynomial time with an 𝖭𝖯 oracle under a natural computational hardness assumption, specifically the assumptions 𝖲𝖧(𝖭𝖯,d), d ∈ ℕ. Our construction will have iterated-logarithm type seed length log^[O(1)](n) for infinitely many n; in particular it will work for all n such that the number n has time-bounded Kolmogorov complexity comparable to the seed length. Afterwards we show that a similar construction gives a generator fooling the same class with a slower runtime under the weak form hardness assumptions; the proof of this construction is relegated to the appendix.

Theorem 16.

Let 𝒪 ⊆ {0,1}^* be any oracle, d, v ∈ ℕ, and assume Hypothesis 𝖲𝖧(𝒪,d+v). Let T(n) ≤ (exp^[v](n))^{O(1)} be time constructible. There exists a generator (𝒢^d_n: {0,1}^{s_d(n)} → {0,1}^n)_{n∈ℕ} computable uniformly in poly(T(O(n))) time, so that the following hold:

  1. 1.

    For all n, s_d(n) ≤ O(log^[d+1](n))

  2. 2.

    For all L ∈ 𝖳𝖨𝖬𝖤^𝒪[T(n)]/log^[d](n), we have

    |Pr_{z∈{0,1}^{s_d(n)}}[𝒢^d_n(z) ∈ L] − Pr_{x∈{0,1}^n}[x ∈ L]| ≤ O(1/log^[d](n))

    for all but finitely many n ∈ {n ∣ 𝖪^{T(n)}(n) ≤ log^[d](n)}.

Proof.

Using our hardness assumption 𝖲𝖧(𝒪,d+v) in combination with Lemma 13, for each time constructible bound R: ℕ → ℕ bounded above by (exp^[d+v](n))^{O(1)} there is a constant c(R) and a pseudorandom generator (G^{(R)}_n: {0,1}^{c(R)·log n} → {0,1}^n)_{n∈ℕ} computable in 𝖳𝖨𝖬𝖤[R(n)^{c(R)}] which fools 𝖳𝖨𝖬𝖤^𝒪[R(n)]/3n with error 1/n. Using c(·) we may define a sequence of constants (c_i)_{i∈ℕ} and functions (T_d, f_d: ℕ → ℕ)_{d∈ℕ∪{0}} as follows:

  1. 1.

    T0=T, f0=(xx).

  2. 2.

    Set c_d = c(T_{d−1}), f_d = c_d·log f_{d−1}(·), T_d(n) = 4·T_{d−1}(exp(n/c_d))^{c_d}

Observe that 3·T_{d−1}(f_{d−1}(n)) + T_{d−1}(f_{d−1}(n))^{c_d} ≤ T_d(f_d(n)).

Then for each d, we have the generator (G^{(T_d)}_n)_{n∈ℕ} defined for every input length, which we abbreviate as G^d. We use this family of generators to construct the generators (𝒢^d_n)_n promised in the theorem statement. In particular, we define for every d ≥ 0 the generator

(𝒢^d_n: {0,1}^{f_{d+1}(n)} → {0,1}^n)_{n∈ℕ},  𝒢^d_n = G^0_n ∘ G^1_{f_1(n)} ∘ ⋯ ∘ G^d_{f_d(n)}

We prove its security by induction, using the following stronger induction hypothesis for each d ≥ 0:

  1. 1.

    𝒢^d_n is computable in time T_{d+1}(f_{d+1}(n)) − 3T(n) with the number n and O(1) additional bits as advice

  2. 2.

    𝒢^d_n fools any L ⊆ {0,1}^n with L ∈ 𝖳𝖨𝖬𝖤^𝒪[T_d(f_d(n))]/f_d(n), with error Σ_{j≤d} f_j(n)^{−1}, for all sufficiently large n satisfying 𝖪^{T(n)}(n) ≤ f_d(n)

For the base case, we know that 𝒢^0_n = G^0_n which, by assumption, fools 𝖳𝖨𝖬𝖤^𝒪[T(n)]/n with error 1/n. Recall that T_1(f_1(n)) ≥ 3T_0(f_0(n)) + T_0(f_0(n))^{c_1} = 3T(n) + T(n)^{c_1}, hence 𝒢^0_n = G^0_n is computable in time T(n)^{c_1} ≤ T_1(f_1(n)) − 3T(n) (with no advice). Now assume the induction hypothesis up to d−1. Let L ⊆ {0,1}^n be given, n sufficiently large, so that L ∈ 𝖳𝖨𝖬𝖤^𝒪[T_d(f_d(n))]/f_d(n) and 𝖪^{T(n)}(n) ≤ f_d(n). Define L_d ⊆ {0,1}^{f_d(n)}, L_d = {x ∣ 𝒢^{d−1}(x) ∈ L}. By induction, 𝒢^{d−1} is computable in time T_d(f_d(n)) − 3T(n) given the number n as advice, and O(1) additional advice bits; hence L_d is computable in 𝖳𝖨𝖬𝖤^𝒪[T_d(f_d(n))]/3f_d(n): if we are given the number n as advice, then we may determine f_0(n), …, f_d(n) using the number d as advice which costs O(1), after which we can compute the generator 𝒢^{d−1}_n, and the language L using f_d(n) additional bits of advice (together with some O(1) bits to specify the algorithm described in the current sentence). Under the assumption 𝖪^{T(n)}(n) ≤ log^[d](n) ≤ f_d(n) we may produce the number n from an additional f_d(n) bits of advice, for a total advice cost 3f_d(n), and the time of the additional operations (beyond the original computation of 𝒢^{d−1}) is bounded by 3T(n).

Now, using the second part of our induction hypothesis, we determine that 𝒢^{d−1}_n fools L; in particular we have:

|Pr_{z∈{0,1}^{f_d(n)}}[𝒢^{d−1}_n(z) ∈ L] − Pr_{x∈{0,1}^n}[x ∈ L]| ≤ Σ_{j<d} f_j(n)^{−1}

On the other hand, using the security of the generator G^d on input length f_d(n) (which is sufficiently large), we have that G^d fools L_d, in particular:

|Pr_{w∈{0,1}^{f_{d+1}(n)}}[G^d(w) ∈ L_d] − Pr_{z∈{0,1}^{f_d(n)}}[z ∈ L_d]| ≤ f_d(n)^{−1}

Recalling that 𝒢^d_n = 𝒢^{d−1}_n ∘ G^d and combining the previous two inequalities using the triangle inequality we have:

|Pr_{w∈{0,1}^{f_{d+1}(n)}}[𝒢^d_n(w) ∈ L] − Pr_{x∈{0,1}^n}[x ∈ L]| ≤ Σ_{j≤d} f_j(n)^{−1}

which establishes the second condition in our inductive hypothesis. For the first, note that to compute 𝒢^d_n, we need only recover the number n from its shortest 𝖪^{T(·)} description in time T(n), compute 𝒢^{d−1}_n which takes time T_d(f_d(n)) − 3T(n) by induction, and finally compute G^d which takes time T_d(f_d(n))^{c_{d+1}}; overall this is bounded by T_{d+1}(f_{d+1}(n)) − 3T(n).

At this point the theorem is proven; it remains to verify a few bounds:

  1. 1.

    Σ_{j≤d} f_j(n)^{−1} ≤ O(f_d(n)^{−1}), so that the error bound is as stated in the theorem.

  2. 2.

    f_d(n) ≤ O(log^[d](n))

  3. 3.

    T_d(f_d(n)) ≤ poly(T(O(n)))

The first and second hold trivially. For the last, we prove it by induction; more specifically we will show by induction that T_d(O(f_d(n))) ≤ T(O(n))^{O(1)}. In the base case, T_0(O(f_0(n))) = T(O(n)). We then have that

T_d(O(f_d(n))) = 4·T_{d−1}(exp(O(c_d·log f_{d−1}(n))/c_d))^{c_d} (1)
 ≤ T_{d−1}(O(exp(log f_{d−1}(n) + O(1))))^{O(1)} (2)
 = T_{d−1}(O(f_{d−1}(n)))^{O(1)} ≤ T(O(n))^{O(1)} (3)

where the last step uses the induction hypothesis.
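The shape of the recursion can be seen in a toy sketch (Python). Each level expands a seed of length roughly c·log of the previous level's output length, so composing d+1 levels drives the seed length down to about log^[d+1](n). The hash-based expander below is a placeholder of our own with no security claimed; only the seed-length bookkeeping mirrors the construction:

```python
import hashlib
import math

def expand(seed_bits, out_len):
    # placeholder single-level "generator" {0,1}^s -> {0,1}^out_len:
    # deterministic expansion via SHA-256 in counter mode (no security claimed)
    out, ctr = "", 0
    while len(out) < out_len:
        h = hashlib.sha256(f"{seed_bits}:{ctr}".encode()).digest()
        out += "".join(f"{b:08b}" for b in h)
        ctr += 1
    return out[:out_len]

def seed_schedule(n, c=2, depth=3):
    # f_0(n) = n, f_{i+1}(n) = ceil(c * log2 f_i(n)):
    # output/seed lengths of the composed chain
    lens = [n]
    for _ in range(depth):
        lens.append(math.ceil(c * math.log2(lens[-1])))
    return lens

def composed_generator(seed_bits, n, c=2, depth=3):
    lens = seed_schedule(n, c, depth)    # [f_0, f_1, ..., f_depth]
    assert len(seed_bits) == lens[-1]
    x = seed_bits
    for out_len in reversed(lens[:-1]):  # innermost (shortest) level first
        x = expand(x, out_len)
    return x

print(seed_schedule(65536))  # [65536, 32, 10, 7]: a 7-bit seed stretches to 65536 bits
```

With depth 3 and c = 2 the seed length for n = 2^16 is already 7 ≈ log log log n, illustrating the doubly-and-triply-logarithmic regimes of the theorem.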

We highlight an important special case of the above:

Theorem 17.

Assume that for some ϵ > 0 and every time constructible T(n) ≤ 2^{O(n)} and m(n) ≤ poly(n), there is a function n ↦ f_n ∈ {0,1}^{m(n)} computable uniformly in time T(n) which is not computable in 𝖳𝖨𝖬𝖤^𝖭𝖯[T(n)^ϵ]/m(n)^ϵ even infinitely often. Then for every k, there is a pseudorandom generator family (𝒢_n)_{n∈ℕ} with seed length O(log log n) which fools 𝖳𝖨𝖬𝖤^𝖭𝖯[n^k]/log n for all sufficiently large n.

Proof.

We set v = 0, d = 1, T(n) = n^k in Theorem 16. Observe that for all n, we have 𝖪^{O(n)}(n) ≤ log n + O(1), since we may encode the number n in binary.

If we rely instead on the weak form hypotheses 𝖶𝖧() we get:

Theorem 18.

Assume Hypothesis 𝖶𝖧(𝒪,d+1,v) with d ≥ 0, v ≥ 2 fixed constants, and let k be a fixed constant. Then there is a pseudorandom generator with seed length O_{v−1}(log^[d+1](n)) and runtime O_{d+v}(n) that fools 𝖳𝖨𝖬𝖤^𝒪[n^k]/log^[d](n) with error (2·log^[d](n))^{−1} whenever 𝖪^{n^k}(n) ≤ log^[d](n).

As in the case of Lemma 14, we relegate the proof to the appendix since it uses essentially the same ideas as that of Theorem 16 and is in fact a bit simpler. As before we highlight an important special case, obtained immediately from Theorem 18 by setting 𝒪=𝖭𝖯, d=1, v=2:

Theorem 19.

Assume that there is some ϵ > 0 and a language L ∈ 𝖳𝖨𝖬𝖤[2^{2^n}] which is not computable in 𝖳𝖨𝖬𝖤^𝖭𝖯[2^{ϵ·2^n}]/2^{ϵn} on more than finitely many input lengths. Then for any k there is a pseudorandom generator fooling 𝖳𝖨𝖬𝖤^𝖭𝖯[n^k] with seed length O(log log n) and quasipolynomial runtime.

3.3 Fooling Uniform Deterministic Time

For our application to construction of random strings, we will use the generator in the previous section with oracle setting 𝒪=𝖭𝖯. It is nonetheless natural to consider the implications of our generator in the case 𝒪=; in this case we obtain a generator with o(logn) seed length fooling 𝖳𝖨𝖬𝖤[T(n)] under plausible hardness assumptions. However, we demonstrate here that in the regime of a deterministic distinguisher, once O(logn) seed length is achieved for 𝖳𝖨𝖬𝖤[poly(n)], we can reduce the seed length all the way down to an arbitrarily small super-constant value; more generally, we can fool 𝖳𝖨𝖬𝖤[poly(n)]/a(n) with essentially optimal seed length O(loga(n)) for any time constructible a(n)logn. The discrepancy between the simplicity of the result here and the work required in the previous section is due to the key distinction that in this regime, the pseudorandom generator has enough resources to simulate the distinguishers it is trying to fool.

Theorem 20.

Let a(n) ≥ log n be time constructible. Assuming 𝖤 requires 2^{Ω(n)}-size Boolean circuits for sufficiently large n, there is a polynomial time computable generator 𝒢_a: {0,1}^{O(log a(n))} → {0,1}^n which fools 𝖳𝖨𝖬𝖤[n]/a(n) with error 1/3.

Proof.

Using the hardness assumption we obtain (uniformly in n) a pseudorandom generator with seed length c·log n for some constant c. Setting m = n^c and enumerating the outputs of our PRG, we obtain a list of n-bit strings x_1, …, x_m ∈ {0,1}^n which is a pseudorandom generator for 𝖳𝖨𝖬𝖤[n]/n. Let r = 2^{a(n)}, and let L_1, …, L_r enumerate 𝖳𝖨𝖬𝖤[n]/a(n) machines. Define the matrix M ∈ {0,1}^{m×r} with M(i,j) = L_j(x_i); we may compute M in poly(n) time.

We now deterministically construct a set I ⊆ [m] of size a(n)^{O(1)} so that for every j we have

|Pr_{i∈I}[M(i,j)=1] − Pr_{i∈[m]}[M(i,j)=1]| ≤ 1/3

If we can accomplish this, we output the condensed PRG whose range consists of the strings {x_i ∣ i ∈ I}, which has seed length log|I| = O(log a(n)), and we are finished. The algorithm to find the set I is given in the next lemma.

Lemma 21.

There is a polynomial time algorithm which, given M ∈ {0,1}^{m×r} with r ≤ m^{O(1)}, outputs a set I ⊆ [m] of rows so that |I| ≤ O(log r), and for every column j ∈ [r], we have

|Pr_{i∈I}[M(i,j)=1] − Pr_{i∈[m]}[M(i,j)=1]| ≤ 1/3

Proof.

Using known results on the so-called “set balancing problem” [25], for any I ⊆ [m] we may efficiently compute I′ ⊆ I with |I′| ≤ |I|/2 + √(|I|·log r) so that

|Pr_{i∈I}[M(i,j)=1] − Pr_{i∈I′}[M(i,j)=1]| ≤ √(log r/|I|)

for every j. In particular, provided √(|I|·log r) ≤ (1/4)|I|, i.e. |I| ≥ 16·log r, we have that |I′| ≤ (3/4)|I|. Initializing I_0 = [m], we may apply the above repeatedly to obtain I_0 ⊇ I_1 ⊇ ⋯ ⊇ I_q with |I_q| = Θ(log r), and for every j and ℓ < q

|Pr_{i∈I_ℓ}[M(i,j)=1] − Pr_{i∈I_{ℓ+1}}[M(i,j)=1]| ≤ √(log r/|I_ℓ|)

Hence

|Pr_{i∈[m]}[M(i,j)=1] − Pr_{i∈I_q}[M(i,j)=1]| ≤ Σ_{ℓ<q} √(log r/|I_ℓ|)

For a suitable choice of q we will have |I_q| = (1/ϵ)·log r for an arbitrarily small constant ϵ > 0, and √(log r/|I_ℓ|) ≤ √(3/4)·√(log r/|I_{ℓ+1}|), hence

|Pr_{i∈[m]}[M(i,j)=1] − Pr_{i∈I_q}[M(i,j)=1]| ≤ Σ_{ℓ<q} √(log r/|I_ℓ|) ≤ √ϵ·Σ_{i≥1} (3/4)^{i/2} < 1/3
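To illustrate the condensation loop, here is a self-contained sketch (Python). In place of the set-balancing routine of [25] it uses a simpler greedy ℓ2 halving of our own (keep one row per pair so as to minimize the sum of squared column imbalances), which gives the weaker per-step deviation √(r/|I′|)/2 and hence a final set of size O(r) rather than O(log r); the iteration structure is the same:

```python
def halve(M, I):
    # keep one row of each consecutive pair in I, greedily minimizing the
    # sum of squared signed column imbalances S; the greedy choice guarantees
    # sum_j S_j^2 <= sum_t ||d_t||^2, hence per-column deviation <= sqrt(r/|I'|)/2
    r = len(M[0])
    S = [0] * r
    kept = []
    for t in range(0, len(I) - 1, 2):
        a, b = I[t], I[t + 1]
        d = [M[a][j] - M[b][j] for j in range(r)]
        keep_a = sum((S[j] + d[j]) ** 2 for j in range(r))
        keep_b = sum((S[j] - d[j]) ** 2 for j in range(r))
        if keep_a <= keep_b:
            kept.append(a)
            S = [S[j] + d[j] for j in range(r)]
        else:
            kept.append(b)
            S = [S[j] - d[j] for j in range(r)]
    return kept

def condense(M):
    # halve [m] until ~128*r rows remain; the per-step deviations then form
    # a geometric series summing to well below 1/3
    r = len(M[0])
    I = list(range(len(M)))
    while len(I) > 128 * r:
        I = halve(M, I)
    return I
```

For example, with m = 4096 rows and r = 4 columns the loop stops at |I| = 512, and every column's empirical frequency over I provably moves by less than 1/3.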

3.4 Optimality of the Seed Length

We include a simple (and basically standard) argument which indicates that the seed lengths obtained in the previous sections are essentially optimal with respect to the amount of advice they can handle:

Lemma 22.

Let s(n) ≤ log n be time-constructible, and G = (G_n: {0,1}^{s(n)} → {0,1}^n)_{n∈ℕ} an arbitrary family of generators which fools 𝖳𝖨𝖬𝖤[n]/a(n) almost everywhere with error 1/2. Then s(n) ≥ log a(n) − O(1).

In particular, if G fools 𝖳𝖨𝖬𝖤[n] then s(n)λ(n) for some inverse-total computable λ(n)=ω(1).

Proof.

Let ℓ(n) = s(n)+1, and consider the family ℒ of languages L with L_n(x) depending only on the first ℓ(n) bits of x, and |L_n| = 2^{n−1}. For each n ∈ ℕ, we have that |range(G_n)| ≤ 2^{ℓ(n)−1}, hence there exists a language L ∈ ℒ so that for all n sufficiently large, we have range(G_n) ∩ L_n = ∅. On the other hand for every n we have Pr_{x∈{0,1}^n}[x ∈ L] = 1/2, hence G_n fails to fool the language L_n for all n sufficiently large. Every language L ∈ ℒ is decidable in deterministic time n + 2^{ℓ(n)} with 2^{ℓ(n)} bits of advice (we only need that s(n), and hence ℓ(n), are time constructible), so we get a contradiction if 2^{ℓ(n)} ≤ O(n) and 2^{ℓ(n)} ≤ a(n).

In the case of deterministic time (Section 3.3) this implies that the stated construction is optimal for advice complexity a(n) ≥ log n. For our main construction fooling 𝖳𝖨𝖬𝖤^𝖭𝖯, we have no indication that fooling uniform languages (with no advice) requires iterated-logarithmic seed length. However, Theorem 16 is able to fool languages with advice complexity log^[d](n) with seed length O(log^[d+1](n)), so in this sense we achieve the maximum advice complexity for the given seed length O(log^[d+1](n)). It is consistent with our current knowledge that fooling completely uniform algorithms, even with an 𝖭𝖯 oracle, is achievable with arbitrarily slow-growing time constructible seed lengths, as we can achieve in the case of deterministic adversaries (under standard hardness assumptions).

4 Construction of Random Strings and Applications

In this section we discuss the applications of our main result to explicit constructions of random strings and circuit lower bounds.

4.1 Random String Construction Via PRG Concatenation

We start by applying our PRG constructions to give polynomial time explicit constructions of highly random strings. The first is a direct combination of our main PRG constructions with Observations 11 and 12 from the preliminaries. We start with the situation in which we use the strong form of our hardness assumptions, and hence Theorem 16 as our PRG:

Theorem 23.

Let d ∈ ℕ, T(n) ≤ (exp^[c](n))^{O(1)}, and assume Hypothesis 𝖲𝖧(𝖭𝖯,d+c). Then for any constant k, there is a polynomial time algorithm 𝒜 so that, for all n with 𝖪^{T(n)}(n) ≤ log^[d](n), 𝒜(1^n) ∈ {0,1}^n is an n-bit string x with 𝖪^{T(n)}(x) ≥ Ω(n/(log^[d](n))^{O(1)}).

In the special case d = 1, c = 0, under a hardness assumption for singly exponential time bounds and 𝖭𝖯 oracles, we obtain an algorithm 𝒜 which produces, for every n, a string x of length n with 𝖪^{T(n)}(x) ≥ n/poly(log n).

More generally, a hardness assumption for exponential time bounds and 𝒪 oracles yields an efficient algorithm for computing strings with Ω~(n) 𝒪-oracle time-bounded Kolmogorov complexity, provided there is a poly(T(n)) time 𝒪 oracle algorithm which can test the 𝖪T(n),𝒪 complexity of an n-bit string.

Proof.

This is a rather direct corollary of Theorem 16. Let n be given with 𝖪^{T(n)}(n) ≤ log^[d](n), and let s_d(·) ≤ O(log^[d+1](·)) be the seed length bound of the generator 𝒢^d from Theorem 16 using the base time bound T(n). Let n′ be the largest integer such that n′·2^{s_d(n′)} ≤ n; clearly 𝖪^{T(n)}(n′) ≤ 𝖪^{T(n)}(n) + O(1) ≤ log^[d](n) + O(1), hence we may apply Theorem 16 to compute (uniformly) a pseudorandom generator 𝒢^d_{n′}: {0,1}^{s_d(n′)} → {0,1}^{n′} which fools 𝖳𝖨𝖬𝖤^𝖭𝖯[T(n)]/O(1) on input length n′. By Observation 12 we conclude that some string z in the range of 𝒢^d_{n′} has 𝖪^{T(n)}(z) ≥ n′−1, hence the concatenation of all 2^{s_d(n′)} strings in the range of 𝒢^d_{n′} (padded at the end with 0s to bring the length up to n) will have 𝖪^{T(n)}(·) complexity at least n′ − O(s_d(n′)) ≥ n/(log^[d](n))^{O(1)} (recall Observation 11).

The second part of the theorem is the special parameter setting given by Theorem 17, and for the last sentence in the statement, we use the fact that Theorem 17 works for any oracle 𝒪.

If we instead rely on the weaker form hardness assumptions and the associated PRG from Theorem 18 we obtain via the exact same argument:

Theorem 24.

Assume Hypothesis 𝖶𝖧(𝖭𝖯,d+1,v) with d ≥ 0, v ≥ 2. Then for every k, there is an O_{d+v}(n) time algorithm 𝒜 such that for all sufficiently large n, 𝒜(1^n) ∈ {0,1}^n is a string x satisfying 𝖪^{n^k}(x) ≥ n/O_v(log^[d](n)).

In particular, assuming that there is a language in 𝖳𝖨𝖬𝖤[2^{2^n}] which cannot be decided in 𝖳𝖨𝖬𝖤^𝖭𝖯[2^{ϵ·2^n}]/2^{ϵn} infinitely often, there is a quasipolynomial time algorithm which outputs strings x ∈ {0,1}^n with 𝖪^{n^k}(x) ≥ n/poly(log n).

More generally, a hardness assumption for exponential time bounds and 𝒪 oracles yields an efficient algorithm for computing strings with Ω~(n) 𝒪-oracle time-bounded Kolmogorov complexity, provided there is a poly(T(n)) time 𝒪 oracle algorithm which can test the 𝖪T(n),𝒪 complexity of an n-bit string.

We now move on to a second class of constructions which achieves a far superior degree of randomness, but the cost is that our construction only works for infinitely many n, and unlike the previous we have no control over which specific values of n our construction succeeds on.

Theorem 25.

Assume Hypothesis 𝖲𝖧(𝖭𝖯,d). Then for any constant k, there is a polynomial time algorithm 𝒜 so that, for infinitely many n, 𝒜(1^n) ∈ {0,1}^n is an n-bit string x with 𝖪^{n^k}(x) ≥ n − poly(log^[d](n)).

Proof.

Applying the assumption with Theorem 16 and the argument in the previous proof, we have an algorithm 𝒜 which prints, for all sufficiently large n in N_d := {n ∣ n = exp^[d+1](n′) for some n′ ∈ ℕ}, a list of poly(log^[d](n)) strings, one of which must have 𝖪^{n^k}(·) ≥ n−1. We now define a second algorithm 𝒜′ which, given n, computes the largest value m ≤ n with m ∈ N_d, and prints the (n−m)th element of the list 𝒜(1^m) (if n−m is larger than the length of the list, it does something arbitrary). For every m ∈ N_d, there is then some n ≤ m + poly(log^[d](m)) for which 𝒜′(1^n) prints a string x of length m with 𝖪^{n^k}(x) ≥ m−1 ≥ n − poly(log^[d](n)).

As before, a modified variant of this construction can be done in the case of the alternative weak form hardness assumptions, where the cost of using the weaker kind of assumption is that the run time of the construction will degrade correspondingly; we leave the details to the interested reader as we will not use this variant in what follows.

4.2 Explicit Constructions Under Plausible Hardness Assumptions

By our main result, under reasonable hardness assumptions we are able to obtain explicit constructions of strings with time-bounded Kolmogorov complexity n − β(n), where β(n) grows like a polynomial in an arbitrarily slow iterated logarithm function. We show here how to use this construction to build various important pseudorandom objects under our main hardness assumptions. We start with a list of fundamental explicit construction problems, focusing on those studied originally in [20], in addition to the so-called “incompressible functions” considered in [3]. For background on the substantial literature dedicated to these problems see [20, 3].

Definition 26 (List of Explicit Construction Problems).
  • (s,r)-Rigidity: A matrix M ∈ 𝔽_2^{n×n} such that M ≠ S + R whenever S, R ∈ 𝔽_2^{n×n}, S has at most s nonzero entries, and R has rank at most r

  • s-Hard Boolean Functions: A Boolean function f:{0,1}n{0,1} not computed by any Boolean circuit of size s

  • (s,k)-Incompressible Boolean Functions: A Boolean function f:{0,1}n{0,1} which cannot be expressed in the form f(x)=g(C(x)) where C:{0,1}n{0,1}k is a circuit of size s and g:{0,1}k{0,1} is an arbitrary Boolean function

  • s-𝖯𝖲𝖯𝖠𝖢𝖤𝖼𝖼-Complexity: A communication matrix M{0,1}2n×2n such that M cannot be solved by space-s communication protocols

  • (s,t)-Bit-Probe Complexity: A data structure problem D{0,1}2n×2n which requires space s or time t in the bit-probe model.

  • Ramsey: A graph G on n vertices, given by its n(n−1)/2 edge indicators, with no clique or independent set of size 2.1·log n

We then have:

Lemma 27.

Say that a Range Avoid instance has “stretch n → m” if it is of the form C: {0,1}^n → {0,1}^m. The following uniform reductions exist from the problems in Definition 26 to Range Avoidance:

  1. 1.

    (s,r)-Rigidity reduces uniformly to Avoid with stretch (2s·log n + 2nr) → n^2

  2. 2.

    s-Hard Boolean Functions reduces uniformly to Avoid with stretch (1+o(1))·s·log s → 2^n

  3. 3.

    (s,k)-Incompressible Boolean Functions reduces uniformly to Avoid with stretch (2^k + (1+o(1))·s·log s) → 2^n

  4. 4.

    s-𝖯𝖲𝖯𝖠𝖢𝖤^𝖼𝖼-Complexity reduces uniformly to Avoid with stretch (2^{O(s)} + n^2) → 2^{2n}

  5. 5.

    (s,t)-Bit-Probe Complexity reduces uniformly to Avoid with stretch (s·2^n + 2^t + n^2) → 2^{2n}

  6. 6.

    Ramsey reduces uniformly to Avoid with stretch (n − Ω(log n)) → n

Proof.

All of these proofs occur in [20], with the exception of Incompressible Functions. For this, if f(x) = g(C(x)) for a circuit C: {0,1}^n → {0,1}^k of size s and g: {0,1}^k → {0,1}, we may represent the circuit C (gate by gate) using (1+o(1))·s·log s bits via a standard encoding, and the function g via its 2^k-bit truth table. From these it is clear how we may reproduce the function f efficiently.

Combining this with our first generator construction from the previous section and Observation 10, we have:

Theorem 28.

Assume Hypothesis 𝖲𝖧(𝖭𝖯,d). Then for all n with 𝖪^{poly(n)}(n) ≤ log^[d](n), we have the following:

  1. 1.

    Polynomial time construction of matrices in 𝔽2n×n which are
    (n^2/(log n · poly(log^[d](n))), n/poly(log^[d](n)))-rigid. In particular, for any d > 2 such matrices achieve Valiant rigidity.

  2. 2.

    Boolean functions in 𝖤 = 𝖳𝖨𝖬𝖤[2^{O(n)}] with circuit complexity at least 2^n/(n·poly(log^[d−1](n)))

  3. 3.

    Boolean functions in 𝖤 = 𝖳𝖨𝖬𝖤[2^{O(n)}] which are (Ω(2^n/(n·poly(log^[d−1](n)))), n − O(log^[d] n))-incompressible

  4. 4.

    Polynomial time construction of communication matrices M ∈ {0,1}^{2^n×2^n} which require space Ω(n) (for any setting of d)

  5. 5.

    Polynomial time construction of data structure problems D ∈ {0,1}^{2^n×2^n} which require space Ω(2^n) or time Ω(n) in the bit-probe model (for any setting of d)

In the important case d=1, each of the constructions works for all n and requires the hardness assumption only for singly-exponential time bounds.

We arrive at similar conclusions if we use the weak form hypothesis in combination with Theorem 18 instead, at the cost of a slow-down in our construction algorithms. We highlight below the results obtained by using the special case of Theorem 18 given in Theorem 19:

Theorem 29.

Assume that there is some ϵ > 0 and a language L ∈ 𝖳𝖨𝖬𝖤[2^{2^n}] which is not computable in 𝖳𝖨𝖬𝖤^𝖭𝖯[2^{ϵ·2^n}]/2^{ϵn} on more than finitely many input lengths (this is Hypothesis 𝖶𝖧(𝖭𝖯,1,2)). Then we have the following for all n:

  1. 1.

    Quasipolynomial time construction of matrices in 𝔽_2^{n×n} which are (n^2/(log n)^{O(1)}, n/(log n)^{O(1)})-rigid.

  2. 2.

    Boolean functions in 𝖤𝖷𝖯 = 𝖳𝖨𝖬𝖤[2^{n^{O(1)}}] with circuit complexity at least 2^n/poly(n)

  3. 3.

    Boolean functions in 𝖤𝖷𝖯 = 𝖳𝖨𝖬𝖤[2^{n^{O(1)}}] which are (Ω(2^n/poly(n)), n − O(log n))-incompressible

  4. 4.

    Quasipolynomial time construction of communication matrices M ∈ {0,1}^{2^n×2^n} which require space Ω(n)

  5. 5.

    Quasipolynomial time construction of data structure problems D ∈ {0,1}^{2^n×2^n} which require space Ω(2^n) or time Ω(n) in the bit-probe model

The above results cover the vast majority of explicit construction problems considered originally in [20]. A notable exception is the construction of near-optimal Ramsey graphs; here the stretch (n − Ω(log n)) → n given by Lemma 27 is too small to apply Theorem 23, and we must appeal instead to Theorem 25. Using this approach, we will be able to construct near-optimal Ramsey graphs infinitely often, provided we modify appropriately our encoding of graphs by strings to be well-defined on every input length:

Definition 30 (Ramsey Graphs of Every Length).

We associate every string x ∈ {0,1}^* to a graph G_x as follows. Let n = |x| and set n′ to be the greatest integer so that n′(n′−1)/2 ≤ n. Set m = n′(n′−1)/2, truncate x to its first m bits, and interpret these as a graph on n′ vertices canonically. Under this encoding, we say that x encodes a Ramsey graph if G_x contains no clique or independent set of size 2.1·log n′.
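The encoding is easy to make concrete. The brute-force checker below (Python; exponential time, so small n′ only) decodes a string and tests the Ramsey condition, with the threshold taken as ⌈2.1·log₂ n′⌉ on the number n′ of vertices:

```python
from itertools import combinations
from math import comb, ceil, log2

def decode_graph(x):
    # largest n' with C(n',2) <= |x|; read the first C(n',2) bits as
    # edge indicators in lexicographic pair order (zip truncates x)
    n = len(x)
    nv = 2
    while comb(nv + 1, 2) <= n:
        nv += 1
    pairs = combinations(range(nv), 2)
    edges = {p for p, bit in zip(pairs, x) if bit == "1"}
    return nv, edges

def is_ramsey(x):
    nv, edges = decode_graph(x)
    k = ceil(2.1 * log2(nv))
    for S in combinations(range(nv), k):
        ps = list(combinations(S, 2))
        if all(p in edges for p in ps) or all(p not in edges for p in ps):
            return False  # found a k-clique or a k-independent set
    return True

print(is_ramsey("1" * 10))      # False: the complete graph K_5 has a 5-clique
print(is_ramsey("1001100101"))  # True: the 5-cycle C_5 has no clique or ind. set of size 5
```

At this toy scale the threshold k = 5 equals n′, so the condition is weak; the content of Lemma 31 is of course about the asymptotic regime.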

We then have:

Lemma 31.

Assume Hypothesis 𝖲𝖧(𝖭𝖯,2). Then there is a polynomial time algorithm which, for infinitely many n, outputs a string x{0,1}n so that Gx is Ramsey.

Proof.

We claim that we may uniformly construct a Range Avoidance instance C: {0,1}^{n−Ω(log n)} → {0,1}^n so that for all n, the range of C contains every x such that G_x fails to be Ramsey. In particular, given n we may compute efficiently the parameters n′, m, being the number of vertices and potential edges for the graphs {G_x ∣ x ∈ {0,1}^n}; we then apply Lemma 27 to obtain (uniformly) a Range Avoidance instance with stretch (m − Ω(log m)) → m covering all strings z ∈ {0,1}^m which fail to encode Ramsey graphs. If x ∈ {0,1}^n fails to be Ramsey, then x = zy for some z ∈ {0,1}^m which fails to be Ramsey, hence we may uniformly construct an Avoid instance covering every non-Ramsey string x ∈ {0,1}^n with stretch ((m − Ω(log m)) + (n − m)) → n, which is (n − Ω(log n)) → n. We thus conclude that 𝖪^{poly(n)}(x) ≤ n − Ω(log n) whenever G_x fails to be Ramsey. Applying Theorem 25 and our hardness assumption yields the lemma.

4.3 Hardness Condensation

We also apply Theorem 23 to derive a new hardness condensation result. Hardness condensation is a phenomenon where a mild hardness assumption can be transformed into a much stronger hardness assumption of the same type. It was first studied in [4], which showed hardness condensation results for complexity classes with advice. It is more interesting to get hardness condensation for uniform classes, and indeed [20] achieves this for 𝖤^𝖭𝖯: it is shown there that 𝖤^𝖭𝖯 requires exponential-size circuits iff it requires almost maximum-size circuits.

We obtain a new hardness condensation result for deterministic time without an 𝖭𝖯-oracle. Here the condensation is with respect to non-uniform complexity: the stronger hardness assumption can handle almost the maximum non-trivial amount of non-uniformity, while the weaker hardness assumption only involves an exponential amount of non-uniformity.

Theorem 32.

The following are equivalent:

  1.

    For every d there exists v so that, for all time-constructible T(n) ≤ exp^[d](n), 𝖳𝖨𝖬𝖤[T(n)] is not contained even infinitely often in 𝖲𝖯𝖠𝖢𝖤[o_v(T(n))]/o_v(2^n).

  2.

    For every d there exists v so that, for all time-constructible T(n) ≤ exp^[d](n), 𝖳𝖨𝖬𝖤[T(n)] is not contained even infinitely often in 𝖲𝖯𝖠𝖢𝖤[o_v(T(n))]/2^{n−ω(log n)}.

Proof.

Clearly the second item implies the first. We show that the first item implies the second. Assuming (1), we have that for every d there exists v so that 𝖶𝖧(𝖰𝖡𝖥,d,v) holds. Applying Theorem 24, for every d there is v so that for any time-constructible T(n) ≤ exp^[d](n), there is an algorithm A running in time O_v(T(2^n)) which, given 1^{2^n}, outputs a string of length 2^n which cannot be produced by any algorithm running in space T(2^n) with 2^{n−ω(log n)} bits of advice. (We are using here the fact that 𝖲𝖯𝖠𝖢𝖤[T(n)] can be simulated in 𝖳𝖨𝖬𝖤^𝖰𝖡𝖥[poly(T(n))].) Interpreting this string as a function f:{0,1}^n→{0,1}, we obtain a language in 𝖳𝖨𝖬𝖤[O_v(T(2^n))] which is not contained even infinitely often in 𝖲𝖯𝖠𝖢𝖤[T(2^n)]/2^{n−ω(log n)}. Reparameterizing the time bounds yields (2).

4.4 Range Avoidance vs Uniform Range Avoidance

We use our results together with previous work to argue that the Range Avoidance problem becomes much more tractable for uniformly computable maps. While there is compelling complexity-theoretic evidence in various settings that Range Avoidance is intractable in general, our results show that Uniform Range Avoidance is tractable under plausible complexity-theoretic assumptions.

The first setting we consider is where the Range Avoidance instance is an arbitrary polynomial-size circuit C:{0,1}n{0,1}m, and we wish to solve Range Avoidance in polynomial time.

Ilango, Li and Williams [14] showed that in this setting, Range Avoidance is infeasible under the standard assumption that 𝖭𝖯≠𝖼𝗈𝖭𝖯 and the plausible assumption that subexponentially-secure indistinguishability obfuscation exists [17].

Theorem 33 ([14]).

If 𝖭𝖯≠𝖼𝗈𝖭𝖯 and subexponentially-secure indistinguishability obfuscation exists, then for any c>0, there is no polynomial-time algorithm which solves Range Avoidance for polynomial-size circuits mapping n bits to n^c bits.

Corollary 34.

Assume that 𝖭𝖯≠𝖼𝗈𝖭𝖯, that subexponentially-secure indistinguishability obfuscation exists, and Hypothesis 𝖲𝖧(𝖭𝖯,1). There is a constant c such that for all d ≥ c, there is a polynomial-time algorithm which solves Range Avoidance on uniform circuits of size n^d mapping n bits to n^d bits, but no polynomial-time algorithm which solves Range Avoidance on all circuits of size n^d mapping n bits to n^d bits.

Proof.

We show that we can take c to be 1+δ for arbitrarily small δ>0. Indeed, for such c and any d ≥ c, the intractability of general Range Avoidance follows from Theorem 33 under the given assumptions. We show that uniform Range Avoidance can be solved by applying Theorem 23 and using the third assumption. Indeed, by this assumption, for every constant k there exists a polynomial-time algorithm A_k which, on input 1^n, outputs a length-n string of 𝖪^{n^k} complexity at least n/𝗉𝗈𝗅𝗒𝗅𝗈𝗀(n).

Let {C_n} be a sequence of uniform circuits mapping n bits to n^d bits, where d is any constant greater than 1. Here our notion of uniformity is standard 𝖣𝖫𝖮𝖦𝖳𝖨𝖬𝖤-uniformity, but the argument can be adapted to work for more relaxed notions of uniformity. Note that any output y=C_n(x) has 𝖪^{n^k} complexity at most n+O(log n) for any k>d, as we can first generate C_n from n using the uniformity condition and then generate y from C_n and x by simulating C_n. By running A_k on input 1^{n^d}, we obtain a string of length n^d of 𝖪^{n^{kd}} complexity at least n^d/𝗉𝗈𝗅𝗒𝗅𝗈𝗀(n), which is therefore a non-output of C_n once n is large enough.

We also consider the setting where the Range Avoidance instance is an arbitrary polynomial-size 𝖳𝖰𝖡𝖥-oracle circuit C:{0,1}^n→{0,1}^m, and we wish to solve Range Avoidance in polynomial time. The negative evidence here is even stronger: the task is intractable under the standard assumption 𝖯𝖲𝖯𝖠𝖢𝖤⊄𝖭𝖯. Somewhat surprisingly, we show that uniform Range Avoidance is tractable under plausible assumptions even though the uniform Range Avoidance instance C is allowed to use a 𝖳𝖰𝖡𝖥-oracle.

Theorem 35 ([1]).

Suppose 𝖯𝖲𝖯𝖠𝖢𝖤⊄𝖭𝖯 (resp. 𝖯𝖲𝖯𝖠𝖢𝖤⊄𝖭𝖳𝖨𝖬𝖤[2^{log^{O(1)}n}]). There is a constant c such that for all d ≥ c, there is no polynomial-time (resp. quasipolynomial-time) algorithm which solves Range Avoidance on all 𝖳𝖰𝖡𝖥-oracle circuits of size n^d.

In fact, [1] showed that computing a 𝖪𝖲-incompressible string x of length n conditional on a given string y is hard for polynomial time if 𝖯𝖲𝖯𝖠𝖢𝖤⊄𝖭𝖯, where 𝖪𝖲 is space-bounded Kolmogorov complexity. This is easily seen to imply the result above by considering the Range Avoidance instance which takes a 𝖳𝖰𝖡𝖥-oracle program as input and evaluates it. The extension to quasipolynomial time bounds is not stated in [1] but follows immediately from the proof.

Corollary 36.

Assume 𝖲𝖧(𝖯𝖲𝖯𝖠𝖢𝖤,1), and that 𝖯𝖲𝖯𝖠𝖢𝖤⊄𝖭𝖳𝖨𝖬𝖤[2^{log^{O(1)}n}]. Then there is a constant c such that for all d ≥ c, there is a polynomial-time algorithm which solves Range Avoidance on uniform 𝖳𝖰𝖡𝖥-oracle circuits of size n^d, but no polynomial-time algorithm which solves Range Avoidance on all 𝖳𝖰𝖡𝖥-oracle circuits of size n^d.

Proof.

For the negative result on Range Avoidance, we simply apply Theorem 35. For the positive result, we apply Theorem 23 (relativized to a 𝖳𝖰𝖡𝖥 oracle) and then use the same argument as in the proof of the previous corollary.

5 Barriers to Improvements

In this section we present two barriers to improving our main results. First, we show that the seed length log^[O(1)]n achieved by our PRG from Section 3 is in some sense optimal for hardness/randomness approaches which use the same overall structure as our argument: specifically, we show that our argument in fact yields a construction of a special kind of seeded extractor with seed length log^[O(1)]n, and prove unconditionally that this is the minimal achievable seed length in any such construction. Second, we show that there is no relativizing argument that directly reduces the construction of 𝖪^{poly(n)}-random strings to the construction of hard truth tables.

5.1 Multi-Reconstructive Extractors for CG Sources

The classical approach to turning hardness into randomness [26, 15] was famously shown by Trevisan [29] to be essentially equivalent to the task of constructing an explicit seeded extractor. Roughly speaking, Trevisan showed that any method for producing a pseudorandom generator using a hard boolean function whose correctness is proven in a sufficiently black box fashion must actually produce a seeded extractor, which treats the hard function as a high min-entropy source and the seed of the PRG as the seed in the seeded extractor.

In Section 3 we constructed a PRG against uniform algorithms (or more generally, algorithms with extremely small advice) whose seed length was much smaller than log n, using a different kind of hardness assumption applied across many different input lengths. In this section we cast such a hardness/randomness construction in terms of certain seeded multi-source extractors, where each source corresponds to a hardness assumption used at a different input length. Through this connection we are able to show: (1) our construction immediately yields a seemingly new (and very simple) multi-source seeded extractor with interesting parameters, and (2) the iterated-logarithmic (log^[d](n), d=O(1)) seed length of our PRGs from Section 3 is necessary for any hardness/randomness tradeoff that uses the same general framework as ours, and in particular which uses “only O(1) many” hardness assumptions.

We start by recalling the proof of correctness of the PRG from Section 3, first in the case where O(log log n) seed length is achieved using hardness assumptions at two different input lengths. We first used a hard function whose truth table had size poly(n) in order to get a PRG with seed length O(log n); call this function f1. We then bounded the running time of this PRG by some T(n)=poly(n), and in the next step required a hard function f2 of truth table size log n which was hard for algorithms with running time T(n); in particular, f2 needed to be hard even in a regime where the computation of f1 is easy. For this reason, we may roughly interpret this as saying that f2 is a function (of much smaller input length) which is hard conditioned on f1. When we iterate the argument to reduce the seed length to log^[d](n), we require a sequence of functions f1,…,fd of smaller and smaller input lengths, with fj being hard for algorithms with enough running time to compute all of f1,…,f_{j−1}, i.e. which is hard conditioned on f1,…,f_{j−1}. This is naturally analogous to the following well-studied information-theoretic notion of a sequence of random variables, each with high min-entropy conditioned on the previous:

Definition 37 (CG Sources [9]).

A random variable 𝒳¯=(𝒳1,…,𝒳d) is said to be a CG source with d blocks and entropy sequence (k1,…,kd) if, for every (x1,…,xd)∈supp(𝒳¯) and every j≤d,

H_∞(𝒳j ∣ 𝒳1=x1,…,𝒳_{j−1}=x_{j−1}) ≥ k_j

In the case j=1, this means H_∞(𝒳1) ≥ k1.
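For intuition, the CG condition is directly checkable for small finite sources. The sketch below assumes the source is flat (uniform over a list of tuples), which is a simplifying assumption; general CG sources need not be flat:

```python
import math
from collections import defaultdict

def is_cg_source(samples, ks):
    """Check whether the uniform distribution over `samples` (a list of
    d-tuples, repeats allowed) is a CG source with min-entropy sequence
    ks = (k_1, ..., k_d): conditioned on any prefix in the support,
    block j must have min-entropy at least k_j."""
    d = len(ks)
    for j in range(d):
        groups = defaultdict(list)  # prefix of length j -> values of block j
        for s in samples:
            groups[s[:j]].append(s[j])
        for block in groups.values():
            counts = defaultdict(int)
            for b in block:
                counts[b] += 1
            p_max = max(counts.values()) / len(block)
            # conditional min-entropy of this block given the prefix
            if -math.log2(p_max) < ks[j] - 1e-9:
                return False
    return True
```

For example, the uniform distribution on {0,1}×{0,1} is a CG source with entropy sequence (1,1), while a source whose second block is determined by the first is not.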

A CG extractor is a function which can produce nearly uniform randomness from a CG source and an independent uniform seed:

Definition 38 (Seeded CG Extractors).

We say that a function E:∏_{j≤d}{0,1}^{m_j}×{0,1}^s→{0,1}^n is a CG extractor for entropy sequence (k1,…,kd) and error ϵ if, for every CG source 𝒳¯=(𝒳1,…,𝒳d) supported on ∏_{j≤d}{0,1}^{m_j} with entropy sequence (k1,…,kd), we have that E(𝒳¯,𝒰_s) is ϵ-close to 𝒰_n in TV distance, where 𝒰_s is the uniform distribution on {0,1}^s (independent of the CG source) and 𝒰_n is uniform on {0,1}^n. The parameter s is called the seed length of the extractor.

We will now show that our PRG construction can be viewed as giving a specific kind of explicit reconstructive extractor for CG sources (with a seed), in the same sense that standard hardness/randomness constructions can be viewed as reconstructive single-source seeded extractors via Trevisan’s connection. We define such reconstructive seeded CG extractors as follows:

Definition 39 (Multi-Reconstructive Extractors).

Let E:∏_{j≤d}{0,1}^{m_j}×{0,1}^s→{0,1}^n. We say that E is a multi-reconstructive extractor for entropy sequence (k1,…,kd) and error ϵ if there are poly(m1,…,md)-time algorithms R1,…,Rd, each taking oracle access to some D:{0,1}^n→{0,1}, with the following behavior:

  1.

    Each R_j takes strings x1∈{0,1}^{m_1},…,x_{j−1}∈{0,1}^{m_{j−1}} and an advice string a_j∈{0,1}^{k_j}, and outputs a string in {0,1}^{m_j} (in the case j=1, the only input to R1 is a1∈{0,1}^{k_1}).

  2.

    For every x¯∈∏_{j≤d}{0,1}^{m_j}, if

    |Pr_{z∼{0,1}^s}[D(E(x¯,z))=1] − Pr_{y∼{0,1}^n}[D(y)=1]| ≥ ϵ

    then there exists j≤d and some a_j∈{0,1}^{k_j} such that R_j^D(x1,…,x_{j−1},a_j)=x_j.

Note that in the case d=1, we obtain the standard definition of reconstructive single source extractors. As shown in the single-source case by Trevisan [29], we may observe that any multi-reconstructive extractor is automatically a CG extractor with similar parameters:

Lemma 40.

If E:∏_{j≤d}{0,1}^{m_j}×{0,1}^s→{0,1}^n is a multi-reconstructive extractor for entropy sequence (k1,…,kd) and error ϵ, then it is a CG extractor for entropy sequence (2k1,…,2kd) and error ϵ′ = ϵ + ∑_{j≤d}2^{−k_j}.

Proof.

Let 𝒳¯ be a CG source with entropy sequence (2k1,…,2kd). The deviation of E(𝒳¯,𝒰_s) from uniform is bounded by ϵ+δ, where δ is the maximum over all distinguishers D of the probability that any of the d reconstruction procedures associated with E succeeds on a random sample (x1,…,xd)∼𝒳¯ from the source. We bound the probability for each reconstruction procedure separately and take a union bound over j≤d. For a fixed D and any fixing of 𝒳1=x1,…,𝒳_{j−1}=x_{j−1}, there are at most 2^{k_j} values in {0,1}^{m_j} that R_j^D(x1,…,x_{j−1},a_j) can output as we range over a_j; if H_∞(𝒳j∣𝒳1=x1,…,𝒳_{j−1}=x_{j−1}) ≥ 2k_j, then the probability that 𝒳j lies in this set is at most 2^{−k_j}. Summing over j≤d gives δ ≤ ∑_{j≤d}2^{−k_j}, and hence total error at most ϵ′.

We can now recast the central step of our main construction, Theorem 16, in the language of reconstructive extractors:

Theorem 41.

Let E1,…,Ed, d=O(1), be polynomial-time computable functions E_j:{0,1}^{m_j}×{0,1}^{s_j}→{0,1}^{n_j} such that each E_j is a reconstructive single-source extractor for min-entropy k_j and error ϵ_j. Suppose that for each j<d we have s_j=n_{j+1}. Define E~_j:∏_{j′≤j}{0,1}^{m_{j′}}×{0,1}^{s_j}→{0,1}^{n_1} by induction on j, with E~_1=E_1 and E~_{j+1}(x1,…,x_{j+1},z)=E~_j(x1,…,x_j,E_{j+1}(x_{j+1},z)). Then E~_d is a multi-reconstructive extractor for entropy sequence (k1,…,kd) and error ∑_{j≤d}ϵ_j.

Proof.

We prove by induction on j that E~_j is a multi-reconstructive extractor with error δ_j := ∑_{j′≤j}ϵ_{j′}. By definition this is true for E~_1=E_1. Now, suppose E~_j is a multi-reconstructive extractor for entropy sequence (k1,…,kj) and error δ_j, so there are reconstruction procedures R1,…,Rj, taking oracle access to a function D:{0,1}^{n_1}→{0,1}, such that for any D distinguishing E~_j(x¯,⋅) with error δ_j, there exists j′≤j so that R_{j′}^D(x1,…,x_{j′−1},a) produces x_{j′} for some a∈{0,1}^{k_{j′}}. We then consider E~_{j+1}(x1,…,x_{j+1},z); our reconstruction procedures R1,…,Rj will be as they were for E~_j, and for R_{j+1} we use the reconstruction procedure for the single-source extractor E_{j+1}. Let D be given. For any x1,…,x_{j+1} such that the first j reconstruction procedures R1,…,Rj all fail to reconstruct a symbol of x1,…,xj from its prefix using D, we know that E~_j(x1,…,xj,𝒰_{s_j}) fools D with error δ_j. Now define the test D′:{0,1}^{n_{j+1}}→{0,1} (recall n_{j+1}=s_j) by D′(z)=D(E~_j(x1,…,xj,z)). Then, for a uniformly random z∼{0,1}^{s_j}, we have that D′(z)=1 with probability in the range Pr_{y∼{0,1}^{n_1}}[D(y)=1]±δ_j. Observe that D′ is efficiently computable with oracle access to D and the values x1,…,xj, since each of the extractors E_{j′} is efficiently computable. Thus, using the reconstruction procedure for E_{j+1}, either R_{j+1} succeeds in reconstructing x_{j+1} using oracle access to D, the previous inputs x1,…,xj, and k_{j+1} bits of advice, or else E_{j+1}(x_{j+1},𝒰_{s_{j+1}}) fools D′ with error ϵ_{j+1}, and hence the overall construction fools D with error δ_j+ϵ_{j+1}, completing the proof.
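The inductive definition in Theorem 41 is simply sequential re-seeding: E_{j+1} compresses the seed of stage j+1 through source x_{j+1} into a seed for stage j. A minimal sketch of this wiring (the stand-in extractors below are symbolic placeholders with no extraction guarantee):

```python
def compose(extractors):
    """Given single-source extractors E_1, ..., E_d as functions
    E_j(x_j, z), with the seed length of E_j equal to the output length
    of E_{j+1}, return the composed map satisfying
    E~_{j+1}(x_1, ..., x_{j+1}, z) = E~_j(x_1, ..., x_j, E_{j+1}(x_{j+1}, z))."""
    def composed(sources, z):
        # apply E_d first; its output becomes the seed of E~_{d-1}
        for E_j, x_j in zip(reversed(extractors), reversed(sources)):
            z = E_j(x_j, z)
        return z
    return composed

# checking the wiring on symbolic placeholders
E1 = lambda x, z: f"E1({x},{z})"
E2 = lambda x, z: f"E2({x},{z})"
E = compose([E1, E2])
```

Here `E(["a", "b"], "s")` evaluates to the nested expression `"E1(a,E2(b,s))"`, matching the inductive definition with d=2.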

Using the above in combination with explicit families of single-source reconstructive extractors (e.g. Trevisan’s extractor) we get:

Corollary 42.

For any n and fixed constants d and γ>0, there is an explicit multi-reconstructive extractor E:∏_{j≤d}{0,1}^{m_j}×{0,1}^s→{0,1}^n for entropy sequence (k1,…,kd) and error ϵ, where:

∑_{j≤d}m_j ≤ poly(n),  k_j = m_j^γ,  s ≤ O(log^[d]n),  ϵ ≤ (log^[d−1]n)^{−1}

Moreover, k_j ≥ log^[j]n for each j, and hence by Lemma 40 this construction is also a CG extractor with the same parameters stated above.
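Here log^[d] denotes the d-fold iterated logarithm, which collapses extremely quickly; a small sketch makes the parameter scale concrete:

```python
import math

def iter_log(n, d):
    """The d-fold iterated base-2 logarithm log^[d](n)."""
    v = float(n)
    for _ in range(d):
        v = math.log2(v)
    return v
```

For instance, for n = 2^16 the values for d = 1, 2, 3 are 16, 4, and 2, so even a few levels of iteration drive the seed length down to a near-constant.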

We now prove that the seed length log[d]n achieved above is optimal as a function of the output length n, regardless of how we choose the source lengths m1,,md:

Theorem 43.

Say E:∏_{j≤d}{0,1}^{m_j}×{0,1}^s→{0,1}^n is a CG extractor for min-entropy sequence (k1,…,kd) with k_j ≤ (1/2)m_j, and error 1/2. Then s ≥ Ω(log^[d]n).

To prove this we rely on the following lemma:

Lemma 44.

Let A⊆∏_{j≤d}{0,1}^{m_j} with log|A| ≥ (∑_{j≤d}m_j)−q. Then there is a CG source with entropy sequence (k1,…,kd) supported on A, with k_j ≥ m_j−q−d.

Proof of Lemma 44.

We proceed by induction on d; the case d=1 is trivial. Let A be given; for each x1∈{0,1}^{m_1} define A_{x1}={(x2,…,xd) ∣ (x1,x2,…,xd)∈A} and deg(x1)=|A_{x1}|, so that ∑_{x1∈{0,1}^{m_1}} deg(x1)=|A|. Define the set V⊆{0,1}^{m_1} by

V = {x1 ∣ deg(x1) ≤ exp((∑_{j>1}m_j)−q−1)}

Let U={0,1}^{m_1}∖V; if |U|<exp(m_1−q−1), then we must have

|A| ≤ |V|⋅max_{x1∈V} deg(x1) + |U|⋅max_{x1∈U} deg(x1) (4)
< 2^{m_1}⋅exp((∑_{j>1}m_j)−q−1) + exp(m_1−q−1)⋅exp(∑_{j>1}m_j) (5)
= exp((∑_{j≤d}m_j)−q−1) + exp((∑_{j≤d}m_j)−q−1) = exp((∑_{j≤d}m_j)−q) (6)

and we get a contradiction, since log|A| ≥ (∑_{j≤d}m_j)−q. Hence we know that |U| ≥ exp(m_1−q−1). For every x1∈U, we also know that A_{x1}⊆∏_{j>1}{0,1}^{m_j} has size at least exp((∑_{j>1}m_j)−q−1), so by induction there is a CG source 𝒳_{x1} with min-entropies (k2,…,kd), k_j ≥ m_j−(q+1)−(d−1) ≥ m_j−q−d, supported on A_{x1}. We then construct our overall distribution 𝒳 as follows: sample x1∈U uniformly, then sample (x2,…,xd)∼𝒳_{x1} and output (x1,…,xd). The first coordinate of this distribution has min-entropy k1:=log|U| ≥ m_1−q−1 ≥ m_1−q−d, and the remaining entries form a CG source with min-entropy sequence (k2,…,kd) conditioned on any fixing of the first coordinate; so overall 𝒳 is a CG source with entropy sequence (k1,…,kd).

Proof of Theorem 43.

When d=0, E is of the form E:{0,1}^s→{0,1}^n, so that E(𝒰_s) is 1/2-close to uniform on {0,1}^n; in this case we clearly require s ≥ n−1. For d>0, we will show how to choose some j≤d and construct from E a second extractor E′:∏_{j′≠j}{0,1}^{m_{j′}}×{0,1}^{s′}→{0,1}^n using only d−1 sources, so that E′ is a CG extractor for entropy sequence (k1,…,k_{j−1},k_{j+1},…,kd), the same error (1/2), and at most exponentially larger seed length s′ ≤ s+2^{s+2}+2d. This will yield the theorem by induction.

We will set j to be any index such that m_j ≤ 2^{s+2}+2d; if we can find such an index, we construct E′ by simply moving input j of E into the seed, i.e.

E′(x1,…,x_{j−1},x_{j+1},…,xd,(x_j,z)) = E(x1,…,x_{j−1},x_j,x_{j+1},…,xd,z)

Clearly, for any choice of j, E′ remains a CG extractor for the contracted entropy sequence (k1,…,k_{j−1},k_{j+1},…,kd) with the same error as E. To argue that m_j ≤ 2^{s+2}+2d for some j, we rely on Lemma 44 above. Let 𝒟⊆{0,1}^{{0,1}^n} consist of all functions D:{0,1}^n→{0,1} which depend only on the first s+1 bits of their input, so |𝒟| ≤ 2^{2^{s+1}}. For every x¯∈∏_{j≤d}{0,1}^{m_j}, there is some D_{x¯}∈𝒟 such that

|Pr_{z∼𝒰_s}[D_{x¯}(E(x¯,z))=1] − Pr_{y∼𝒰_n}[D_{x¯}(y)=1]| ≥ 1/2

(take D_{x¯} to be the indicator of the set of (s+1)-bit prefixes of outputs of E(x¯,⋅): it accepts every output, but at most 2^s of the 2^{s+1} possible prefixes, hence a uniform y with probability at most 1/2); hence there exists a fixed D so that the set X¯_D={x¯ ∣ D_{x¯}=D} has log|X¯_D| ≥ (∑_{j≤d}m_j)−2^{s+1}. We then apply Lemma 44 with q=2^{s+1} to conclude that there is a CG source (𝒳1,…,𝒳d) supported on X¯_D with min-entropy sequence (k1′,…,kd′), k_j′:=m_j−2^{s+1}−d. If k_j′ ≥ k_j for all j, this contradicts the correctness of the extractor E, since E(𝒳¯,𝒰_s) would be distinguished with error ≥ 1/2 by the test D. Hence there exists j such that k_j′<k_j, i.e. m_j−k_j < 2^{s+1}+d. Recalling that k_j ≤ (1/2)m_j, we get m_j < 2^{s+2}+2d, which concludes the proof.
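The first step of the argument, finding for each x¯ a distinguisher that depends on only s+1 output bits, can be made concrete: let D accept exactly the (s+1)-bit prefixes that E(x¯,⋅) can produce. At most 2^s of the 2^{s+1} prefixes occur, so D accepts every pseudorandom output but accepts a uniform y with probability at most 1/2. A sketch (the toy map E below is an arbitrary illustration, not an extractor):

```python
from itertools import product

def bitstrings(k):
    """All binary strings of length k."""
    return [''.join(p) for p in product('01', repeat=k)]

def prefix_distinguisher(E, xbar, s):
    """Return the test D from the proof of Theorem 43: D(y) = 1 iff the
    first s+1 bits of y equal the prefix of some output E(xbar, z)."""
    prefixes = {E(xbar, z)[:s + 1] for z in bitstrings(s)}
    return lambda y: y[:s + 1] in prefixes

# toy map with seed length s = 1 and output length n = 3
E = lambda xbar, z: z + '00'
D = prefix_distinguisher(E, xbar=None, s=1)
accept_rate = sum(D(y) for y in bitstrings(3)) / 8
```

Every output E(xbar, z) is accepted, while a uniform 3-bit string is accepted with probability exactly 1/2, so the distinguishing gap is at least 1/2.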

Note that the proof yields more than just a lower bound on s:

Observation 45 (Follows from Proof of Theorem 43).

In any construction achieving the optimal seed length s=O(log^[d]n), there must be some smallest source length m_{j_1} with m_{j_1}=Ω(log^[d−1]n), then a second-smallest source length of Ω(log^[d−2]n), and so on, with the longest source having length Ω(n).

Hence our construction uses essentially the only possible sequence of source lengths (m1,…,md) which can achieve an optimal seed length of log^[d]n.

5.2 No Relativizing Reduction from 𝑲𝐩𝐨𝐥𝐲 Randomness to Hard Truth Tables

In [20] it is shown that if we are permitted the use of an 𝖭𝖯 oracle in our explicit construction algorithms, we have the following appealing equivalence:

Theorem 46.

The following are equivalent:

  1.

    There is a polynomial-time 𝖭𝖯-oracle algorithm constructing strings x∈{0,1}^n with 𝖪^{poly(n)}(x) ≥ n−1.

  2.

    There is a polynomial-time 𝖭𝖯-oracle algorithm constructing truth tables f:{0,1}^n→{0,1} with hardness 2^{Ω(n)} (note that poly(2^n)=2^{O(n)}).

If this kind of equivalence could be shown without the aid of an 𝖭𝖯 oracle, it would supersede all of the main results in this paper. The proof of Theorem 46 is relativizing, and more specifically it gives a black-box reduction from the problem of constructing high Kolmogorov-complexity strings to constructing hard truth tables (the nontrivial direction). We give some indication here that an 𝖭𝖯 oracle is necessary for such an argument to work.

Observe that if f:{0,1}^n→{0,1} has circuit complexity o(2^n/n), then for N=2^n, interpreting f as an N-bit string, we have 𝖪^{N^2}(f) ≤ N/2; this is because a circuit of such size can be described with fewer than N/2 bits, evaluating a circuit on one input takes at most 2^n=N time, and hence reconstructing f from a circuit computing it takes at most quadratic time. Hence if there were an analogue of Theorem 46 in the polynomial-time regime, it would give, in particular, a reduction from constructing strings of large 𝖪^{n^c}(⋅) complexity to constructing strings of large 𝖪^{n^2}(⋅) complexity.

We show here that there is no such reduction which is “black box” in the same sense as Theorem 46. We first need to define the suitable notion of a black box reduction:

Definition 47.

Let c,d be fixed constants and n large. A black box reduction from 𝖪^{n^c}(⋅)-construction to 𝖪^{n^d}(⋅)-construction is an algorithm, given access to some oracle 𝒪:{0,1}*→{0,1}, which has the following behavior:

  1.

    There is a procedure A_n^𝒪 which, given strings y1,…,y_{poly(n)} with y_m∈{0,1}^m, makes poly(n) additional queries to 𝒪 and outputs a string x∈{0,1}^n.

  2.

    For any oracle 𝒪, if y1,…,y_{poly(n)} are strings such that 𝖪^{m^d,𝒪}(y_m) ≥ m/2 for each m, then A^𝒪(y1,…,y_{poly(n)}) outputs a string x∈{0,1}^n with 𝖪^{n^c,𝒪}(x) ≥ n^{Ω(1)}.

Theorem 48.

Let c>d be fixed constants with c−d>1. Then there is no black box reduction from 𝖪^{n^c}(⋅)-construction to 𝖪^{n^d}(⋅)-construction.

Proof.

Consider the Range Avoidance instance C^𝒪:{0,1}^{n^ϵ}→{0,1}^n defined by C^𝒪(z)=(𝒪(⟨z,1^{n^{c−1}},1⟩),…,𝒪(⟨z,1^{n^{c−1}},n⟩)), where ⟨⋅,⋅,⋅⟩ is some standard pairing function. Clearly, for any oracle 𝒪 and any x∈range(C^𝒪), we have 𝖪^{n^c,𝒪}(x) ≤ n^ϵ. Hence if we can find an oracle 𝒪 and supply strings y1,…,y_{poly(n)} so that 𝖪^{m^d,𝒪}(y_m) ≥ m/2 for each m, but A^𝒪(y1,…,y_{poly(n)}) outputs a string in range(C^𝒪), then we are done. Initially we fix the value of the oracle 𝒪 to be zero on all strings of length at most n^{c−1}. We then observe that for any m ≤ 4n, we may determine a string y_m such that 𝖪^{m^d,𝒪}(y_m) ≥ m/2 is already forced by the current information about the oracle; this holds since a machine running in time m^d ≤ O(n^d) < n^{c−1} cannot access any unfixed part of the oracle. We will fix these strings y1,…,y_{4n} for the remainder of the proof.

Now, consider the algorithm 𝒜 with the first 4n arguments fixed, as a function of the remaining arguments y_{4n+1},…,y_{poly(n)}. By the correctness of 𝒜, for any extension of the partially defined oracle 𝒪 and any valid choice of these remaining arguments, 𝒜 will find a solution to the Range Avoidance instance C^𝒪. On the other hand, observe that if y_{4n+1},…,y_{poly(n)} are chosen uniformly at random, then for any oracle 𝒪 the probability that some y_m fails to satisfy 𝖪^{m^d,𝒪}(y_m) ≥ m/2 is bounded by poly(n)⋅2^{−m/2} ≤ 2^{−2n+o(n)}, since for each such m we have m ≥ 4n. Hence we obtain a randomized query procedure, making poly(n) queries to the oracle 𝒪, which outputs a solution to Avoid on C^𝒪 with failure probability 2^{−2n+o(n)}. Since we have not fixed the behavior of 𝒪 above input length n^{c−1}, C^𝒪 can take on the value of an arbitrary map {0,1}^{n^ϵ}→{0,1}^n, and is thus an arbitrary oracle-presented Range Avoidance instance. It then remains only to show that a randomized query algorithm making poly(n) queries to an Avoid instance with stretch n^ϵ→n cannot succeed with probability as high as 1−2^{−2n+o(n)}.

To obtain this randomized query lower bound, we appeal to Yao’s lemma and consider the best success probability of a deterministic poly(n)-query algorithm 𝒬 on a uniformly random instance C:{0,1}^{n^ϵ}→{0,1}^n. For each possible sequence of poly(n) queries, we argue that the probability that the supplied answer is incorrect, conditioned on this query sequence occurring, is at least 2^{−n}. This holds trivially: since poly(n)<2^{n^ϵ}, a random C extending any fixed sequence of poly(n) mappings C(x1)=y1,… has probability at least 2^{−n} of hitting any fixed string in {0,1}^n.

References

  • [1] Scott Aaronson, Harry Buhrman, and William Kretschmer. A qubit, a coin, and an advice string walk into a relational problem. In Venkatesan Guruswami, editor, 15th Innovations in Theoretical Computer Science Conference, ITCS 2024, January 30 to February 2, 2024, Berkeley, CA, USA, volume 287 of LIPIcs, pages 1:1–1:24. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPICS.ITCS.2024.1.
  • [2] Eric Allender. The complexity of complexity. In Adam R. Day, Michael R. Fellows, Noam Greenberg, Bakhadyr Khoussainov, Alexander G. Melnikov, and Frances A. Rosamond, editors, Computability and Complexity – Essays Dedicated to Rodney G. Downey on the Occasion of His 60th Birthday, volume 10010 of Lecture Notes in Computer Science, pages 79–94. Springer, 2017. doi:10.1007/978-3-319-50062-1_6.
  • [3] Benny Applebaum, Sergei Artemenko, Ronen Shaltiel, and Guang Yang. Incompressible functions, relative-error extractors, and the power of nondeterministic reductions (extended abstract). In Proceedings of the 30th Conference on Computational Complexity, CCC ’15, pages 582–600, Dagstuhl, DEU, 2015. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPICS.CCC.2015.582.
  • [4] Joshua Buresh-Oppenheim and Rahul Santhanam. Making hard problems harder. In 21st Annual IEEE Conference on Computational Complexity (CCC 2006), 16-20 July 2006, Prague, Czech Republic, pages 73–87. IEEE Computer Society, 2006. doi:10.1109/CCC.2006.26.
  • [5] Lijie Chen, Shuichi Hirahara, and Hanlin Ren. Symmetric exponential time requires near-maximum circuit size. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 1990–1999. ACM, 2024. doi:10.1145/3618260.3649624.
  • [6] Lijie Chen and Roei Tell. Simple and fast derandomization from very hard functions: eliminating randomness at almost no cost. In STOC, pages 283–291. ACM, 2021. doi:10.1145/3406325.3451059.
  • [7] Yeyuan Chen, Yizhi Huang, Jiatu Li, and Hanlin Ren. Range avoidance, remote point, and hard partial truth table via satisfying-pairs algorithms. In Barna Saha and Rocco A. Servedio, editors, Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023, pages 1058–1066. ACM, 2023. doi:10.1145/3564246.3585147.
  • [8] Yilei Chen and Jiatu Li. Hardness of range avoidance and remote point for restricted circuits via cryptography. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 620–629. ACM, 2024. doi:10.1145/3618260.3649602.
  • [9] Benny Chor and Oded Goldreich. Unbiased bits from sources of weak randomness and probabilistic communication complexity. In 26th Annual Symposium on Foundations of Computer Science (sfcs 1985), pages 429–442, 1985. doi:10.1109/SFCS.1985.62.
  • [10] Eldon Chung, Alexander Golovnev, Zeyong Li, Maciej Obremski, Sidhant Saraogi, and Noah Stephens-Davidowitz. On the randomized complexity of range avoidance, with applications to cryptography and metacomplexity. Electron. Colloquium Comput. Complex., TR23-193, 2023. URL: https://eccc.weizmann.ac.il/report/2023/193.
  • [11] Karthik Gajulapalli, Alexander Golovnev, Satyajeet Nagargoje, and Sidhant Saraogi. Range avoidance for constant depth circuits: Hardness and algorithms. In Nicole Megow and Adam D. Smith, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2023, September 11-13, 2023, Atlanta, Georgia, USA, volume 275 of LIPIcs, pages 65:1–65:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.APPROX/RANDOM.2023.65.
  • [12] Venkatesan Guruswami, Xin Lyu, and Xiuhan Wang. Range avoidance for low-depth circuits and connections to pseudorandomness. In Amit Chakrabarti and Chaitanya Swamy, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2022, September 19-21, 2022, University of Illinois, Urbana-Champaign, USA (Virtual Conference), volume 245 of LIPIcs, pages 20:1–20:21. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.APPROX/RANDOM.2022.20.
  • [13] Shuichi Hirahara. Meta-computational average-case complexity: A new paradigm toward excluding heuristica. Bull. EATCS, 136, 2022. URL: http://bulletin.eatcs.org/index.php/beatcs/article/view/688.
  • [14] Rahul Ilango, Jiatu Li, and R. Ryan Williams. Indistinguishability obfuscation, range avoidance, and bounded arithmetic. In Barna Saha and Rocco A. Servedio, editors, Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023, pages 1076–1089. ACM, 2023. doi:10.1145/3564246.3585187.
  • [15] Russell Impagliazzo and Avi Wigderson. P = bpp if e requires exponential circuits: derandomizing the xor lemma. In Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, STOC ’97, pages 220–229, New York, NY, USA, 1997. Association for Computing Machinery. doi:10.1145/258533.258590.
  • [16] Russell Impagliazzo and Avi Wigderson. Randomness vs. time: De-randomization under a uniform assumption. In 39th Annual Symposium on Foundations of Computer Science, FOCS ’98, November 8-11, 1998, Palo Alto, California, USA, pages 734–743. IEEE Computer Society, 1998. doi:10.1109/SFCS.1998.743524.
  • [17] Aayush Jain, Huijia Lin, and Amit Sahai. Indistinguishability obfuscation from well-founded assumptions. In Samir Khuller and Virginia Vassilevska Williams, editors, STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 60–73. ACM, 2021. doi:10.1145/3406325.3451093.
  • [18] Robert Kleinberg, Oliver Korten, Daniel Mitropolsky, and Christos H. Papadimitriou. Total functions in the polynomial hierarchy. In James R. Lee, editor, 12th Innovations in Theoretical Computer Science Conference, ITCS 2021, January 6-8, 2021, Virtual Conference, volume 185 of LIPIcs, pages 44:1–44:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPICS.ITCS.2021.44.
  • [19] Adam R. Klivans and Dieter van Melkebeek. Graph nonisomorphism has subexponential size proofs unless the polynomial-time hierarchy collapses. In Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, STOC ’99, pages 659–667, New York, NY, USA, 1999. Association for Computing Machinery. doi:10.1145/301250.301428.
  • [20] Oliver Korten. The hardest explicit construction. In 62nd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2021, Denver, CO, USA, February 7-10, 2022, pages 433–444. IEEE, 2021. doi:10.1109/FOCS52979.2021.00051.
  • [21] Oliver Korten and Toniann Pitassi. Strong vs. weak range avoidance and the linear ordering principle. In 65th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2024, Chicago, IL, USA, October 27-30, 2024, pages 1388–1407. IEEE, 2024. doi:10.1109/FOCS61266.2024.00089.
  • [22] Ming Li and Paul M. B. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications, 4th Edition. Texts in Computer Science. Springer, 2019. doi:10.1007/978-3-030-11298-1.
  • [23] Zeyong Li. Symmetric exponential time requires near-maximum circuit size: Simplified, truly uniform. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 2000–2007. ACM, 2024. doi:10.1145/3618260.3649615.
  • [24] Zhenjian Lu and Igor C. Oliveira. Theory and applications of probabilistic kolmogorov complexity. Bull. EATCS, 137, 2022. URL: http://bulletin.eatcs.org/index.php/beatcs/article/view/700.
  • [25] Rajeev Motwani, Joseph (Seffi) Naor, and Moni Naor. The probabilistic method yields deterministic parallel algorithms. Journal of Computer and System Sciences, 49(3):478–516, 1994. 30th IEEE Conference on Foundations of Computer Science. doi:10.1016/S0022-0000(05)80069-8.
  • [26] Noam Nisan and Avi Wigderson. Hardness vs randomness. Journal of Computer and System Sciences, 49(2):149–167, 1994. doi:10.1016/S0022-0000(05)80043-1.
  • [27] Hanlin Ren, Rahul Santhanam, and Zhikun Wang. On the range avoidance problem for circuits. In 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022, Denver, CO, USA, October 31 – November 3, 2022, pages 640–650. IEEE, 2022. doi:10.1109/FOCS54457.2022.00067.
  • [28] Ronen Shaltiel and Christopher Umans. Simple extractors for all min-entropies and a new pseudorandom generator. J. ACM, 52(2):172–216, March 2005. doi:10.1145/1059513.1059516.
  • [29] Luca Trevisan. Extractors and pseudorandom generators. J. ACM, 48(4):860–879, July 2001. doi:10.1145/502090.502099.
  • [30] Christopher Umans. Pseudo-random generators for all hardnesses. In Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing, STOC ’02, pages 627–634, New York, NY, USA, 2002. Association for Computing Machinery. doi:10.1145/509907.509997.

Appendix A Proof of Lemma 14

We prove here Lemma 14, which we restate below for convenience:

Lemma (Lemma 14, Restated).

Assume Hypothesis 𝖶𝖧(𝒪, d+1, v) with d ≥ 0, v ≥ 2. Then for every time-constructible T with exp^{[d]}(n) ≤ T(n) ≤ poly(exp^{[d]}(n)), there is a PRG (G_n : {0,1}^{s(n)} → {0,1}^n)_{n∈ℕ} computable uniformly in time O_{d+v}(T(n)) which fools 𝖳𝖨𝖬𝖤^𝒪[T(n)]/3n and has seed length s(n) ≤ O_{v−1}(log n).

For this we require the following strengthening of Theorem 15, due to Umans [30] (following a similar result of Shaltiel and Umans [28]):

Theorem 49 ([30]).

There is a fixed universal constant γ > 0 so that the following holds. Let h : ℕ → ℕ be any time-constructible "hardness" parameter with h(n) ≤ 2^n. On input n and given oracle access to a function f : {0,1}^n → {0,1}, 𝖴𝖬 computes a function 𝖴𝖬^f : {0,1}^{s(n)} → {0,1}^{m(n)} in 2^{O(n)} time, for some time-constructible m(n) = Θ(h(n)^γ) and s(n) ≤ O(n), such that for any D : {0,1}^{m(n)} → {0,1} with

|Pr_{x ∼ {0,1}^{m(n)}}[D(x) = 1] − Pr_{z ∼ {0,1}^{s(n)}}[D(𝖴𝖬^f(z)) = 1]| ≥ 1/m(n)

there exists a circuit C with D-oracle gates computing f whose total size is at most h(n).

Proof of Lemma 14.

Assuming Hypothesis 𝖶𝖧(𝒪, d+1, v), we conclude that for some ϵ > 0 there is a language L in 𝖳𝖨𝖬𝖤[exp^{[d+1]}(n)] which is not in 𝖳𝖨𝖬𝖤^𝒪[Φ_{v,ϵ}(exp^{[d+1]}(n))]/Φ_{v,ϵ}(2^n) even infinitely often. Let h(n) = Φ_{v,ϵ/2}(2^n), and define G̃_n : {0,1}^{s(n)} → {0,1}^{m(n)} by G̃_n = 𝖴𝖬^{L_n}, where s(n) ≤ O(n) and m(n) = Θ(h(n)^γ) are the time-constructible bounds guaranteed in Theorem 49. By Theorem 49, if G̃_n is distinguished by some D : {0,1}^{m(n)} → {0,1}, then there is an oracle circuit of total size at most h(n) computing L_n; hence if a distinguisher for G̃_n exists which is computable in time t(n) with an 𝒪 oracle and O(m(n)) bits of advice for infinitely many n, then L is decidable with an 𝒪 oracle in time t(n)·h(n) with O(m(n)) + h(n) = O(h(n)) bits of advice for infinitely many n.

Let T(n) = (exp^{[d]}(n))^k for some constant k. We then define our final generator G in terms of G̃ by a simple reparameterization of input lengths: for each n, we determine the least ñ such that m(ñ) ≥ n and T(n)·h(ñ) < Φ_{v,ϵ}(exp^{[d+1]}(ñ)), and set G_n = G̃_ñ (truncating the output length of G̃_ñ as necessary). We want to show that G_n fools time-T(n) algorithms using an 𝒪 oracle and O(n) bits of advice, and runs in time O_{d+v}(T(n)) with seed length O_{v−1}(log n). For the first point, observe that from any distinguisher D : {0,1}^n → {0,1} running in time T(n) with O(n) bits of advice, we obtain a computation of L_ñ running in time T(n)·h(ñ) with O(h(ñ)) bits of advice, which contradicts our hardness assumption provided T(n)·h(ñ) < Φ_{v,ϵ}(exp^{[d+1]}(ñ)) and O(h(ñ)) < Φ_{v,ϵ}(2^ñ), both of which hold by construction.

On the other hand, the runtime and seed length of G_n are bounded by 2^{O(ñ)} ≤ exp^{[d+1]}(ñ) and O(ñ) respectively, so it remains to bound the growth rate of ñ. Recall that we chose ñ to be the least integer such that m(ñ) ≥ n and T(n)·h(ñ) < Φ_{v,ϵ}(exp^{[d+1]}(ñ)), where m(ñ) = Θ(h(ñ)^γ). For the second condition, we have h(x) ≤ 2^x ≤ T(x), hence we can bound ñ by the least integer satisfying T(n)^2 < Φ_{v,ϵ}(exp^{[d+1]}(ñ)), i.e. (exp^{[d]}(n))^{2k} < Φ_{v,ϵ}(exp^{[d+1]}(ñ)), so it suffices here to take ñ = O_{v−1}(log n). For the first condition, to have m(ñ) ≥ n it suffices to have h(ñ) ≥ n, i.e. Φ_{v,ϵ/2}(2^ñ) ≥ n, so setting ñ = O_{v−1}(log n) again suffices. Hence in the end we obtain a runtime of exp^{[d+1]}(O_{v−1}(log n)) ≤ O_{d+v}(T(n)) and seed length O_{v−1}(log n).
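The reparameterization above can be pictured with a small script. Everything here is a toy sketch: m, h, T, and Phi_exp are hypothetical stand-ins for the actual time-constructible bounds m(ñ), h(ñ), T(n), and Φ_{v,ϵ}(exp^{[d+1]}(ñ)); only the search for the least suitable ñ mirrors the proof.

```python
# Toy illustration of the input-length reparameterization in the proof above.
# m, h, T, Phi_exp are hypothetical stand-ins for the bounds in the proof;
# only the search for the least suitable n-tilde mirrors the argument.

def find_n_tilde(n, m, h, T, Phi_exp):
    """Least nt with m(nt) >= n and T(n) * h(nt) < Phi_exp(nt)."""
    nt = 1
    while not (m(nt) >= n and T(n) * h(nt) < Phi_exp(nt)):
        nt += 1
    return nt

# Toy bounds: since Phi_exp grows doubly exponentially while T and h are
# polynomial, n-tilde ends up logarithmic in n, which is the source of the
# O(log n) seed length.
n_tilde = find_n_tilde(
    1000,
    m=lambda x: 2 ** x,
    h=lambda x: x ** 2,
    T=lambda n: n ** 2,
    Phi_exp=lambda x: 2 ** (2 ** x),
)
```

With these toy parameters the doubly exponential bound dominates immediately, so the binding constraint is m(ñ) ≥ n, just as in the proof.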

We will use a slight generalization of Lemma 14 that follows directly by padding (the only difference from the previous lemma is that we allow a slightly more general upper bound on T(n)):

Corollary 50.

Assume Hypothesis 𝖶𝖧(𝒪, d+1, v), d > 0, v ≥ 2, where d is a fixed constant. Then for every time-constructible T with exp^{[d]}(n) ≤ T(n) ≤ O_{d+v}(exp^{[d]}(n)), there is a PRG (G_n : {0,1}^{s(n)} → {0,1}^n)_{n∈ℕ} computable uniformly in time O_{d+v}(T(n)) which fools 𝖳𝖨𝖬𝖤^𝒪[T(n)]/3n and has seed length s(n) ≤ O_{v−1}(log n).

Proof.

Using Lemma 14, from our hardness assumption we get a generator fooling 𝖳𝖨𝖬𝖤^𝒪[exp^{[d]}(n)]/3n with runtime O_{d+v}(T(n)) and seed length s(n) ≤ O_{v−1}(log n). On input length n, we choose some n′ = n′(n) such that exp^{[d]}(n′) ≥ T(n) and apply our generator on input length n′; clearly we may use this generator for input length n as well, by considering a language L ⊆ {0,1}^n as a language L′ ⊆ {0,1}^{n′} depending only on the first n ≤ n′ bits. It suffices to take n′ = O_v(n), in which case the seed length as a function of n is O_{v−1}(log O_v(n)) ≤ O_{v−1}(log n) and the runtime is O_{d+v}(exp^{[d]}(O_v(n))) ≤ O_{d+v}(exp^{[d]}(n)) = O_{d+v}(T(n)).
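The padding step amounts to a one-line wrapper, sketched below. Here toy_family is a hypothetical generator family (at length m it just repeats the seed out to m bits, with no pseudorandomness claimed); only the wrapper itself mirrors the proof.

```python
# Toy illustration of the padding step in Corollary 50: to handle n-bit
# tests, run the Lemma 14 generator at a larger input length n' >= n and
# keep only the first n output bits.

def pad_generator(gen_at, n, n_prime):
    """Wrap a length-n' generator family as a length-n generator (n' >= n)."""
    assert n_prime >= n
    def G(z):
        return gen_at(n_prime)(z)[:n]   # truncate to the first n bits
    return G

def toy_family(m):
    # Hypothetical placeholder generator: repeat the seed out to m bits.
    return lambda z: (z * (m // len(z) + 1))[:m]

G = pad_generator(toy_family, n=4, n_prime=6)
```

Any distinguisher on the first n bits lifts to a distinguisher at length n′, which is exactly why fooling tests at length n′ suffices.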

We are now ready to prove Theorem 18:

Theorem (Theorem 18, Restated).

Assume Hypothesis 𝖶𝖧(𝒪, d+1, v) with d ≥ 0, v ≥ 2 fixed constants, and let k be a fixed constant. Then there is a pseudorandom generator with seed length O_{v−1}(log^{[d+1]}(n)) and runtime O_{d+v}(n) that fools 𝖳𝖨𝖬𝖤^𝒪[n^k]/log^{[d]}(n) with error 2(log^{[d]}(n))^{−1} whenever 𝖪^{n^k}(n) ≤ log^{[d]}(n).

Proof.

Let k ≥ 1, d ≥ 0, v ≥ 2 be fixed constants and let T(n) = n^k. We prove by induction on d that there is a pseudorandom generator (𝒢^d_n : {0,1}^{s_d(n)} → {0,1}^n)_{n∈ℕ} which runs in time O_{d+v}(n), fools 𝖳𝖨𝖬𝖤^𝒪[n^k]/log^{[d]}(n) with error ∑_{j≤d}(log^{[j]}(n))^{−1}, and has seed length s_d(n) ≤ O_{v−1}(log^{[d+1]}(n)), provided the input length n satisfies 𝖪^{n^k}(n) ≤ log^{[d]}(n).

In the case d = 0 we may apply Lemma 14 directly. Now, say that the generator (𝒢^{d−1}_n : {0,1}^{s_{d−1}(n)} → {0,1}^n)_{n∈ℕ} is given, computable in time T_{d−1}(n) := O_{d−1+v}(n), with s_{d−1}(n) ≤ O_{v−1}(log^{[d]} n) and fooling 𝖳𝖨𝖬𝖤^𝒪[n^k]/log^{[d−1]}(n) with error ∑_{j≤d−1}(log^{[j]}(n))^{−1}. Let t_d be a time-constructible function such that T_{d−1}(n)^3 ≤ t_d(s_{d−1}(n)); we may set t_d = O_{d−1+v}(exp^{[d]}(n)). Now, let (G_n : {0,1}^{s(n)} → {0,1}^n)_{n∈ℕ} be the generator guaranteed by Corollary 50 for the time bound t_d(·); so G_n fools 𝖳𝖨𝖬𝖤^𝒪[t_d(n)]/3n, runs in time O_{d+v}(t_d(n)), and has error n^{−1} and seed length s(n) = O_{v−1}(log n). We then set 𝒢^d_n to be the generator with seed length s_d(n) = s(s_{d−1}(n)) given by 𝒢^d_n(z) = 𝒢^{d−1}_n(G_{s_{d−1}(n)}(z)). The runtime of 𝒢^d_n is bounded by some constant times the sum of the runtimes of the two constituent generators, which overall is bounded by O_{d−1+v}(n) + O(t_d(s_{d−1}(n))).

Let L ⊆ {0,1}^n be decided by a 𝖳𝖨𝖬𝖤^𝒪[n^k]/log^{[d]}(n) machine; recall that n = exp^{[d−1]}(ℓ) for some ℓ. Define L′ ⊆ {0,1}^{s_{d−1}(n)} by L′ = {z : 𝒢^{d−1}_n(z) ∈ L}; since 𝒢^{d−1}_n fools L with error ϵ := ∑_{j≤d−1}(log^{[j]}(n))^{−1}, we have Pr_y[y ∈ L] = Pr_z[z ∈ L′] ± ϵ. On the other hand, L′ is decidable in time t_d(s_{d−1}(n)) with 2·log^{[d]}(n) + O(1) bits of advice provided n satisfies 𝖪^{n^k}(n) ≤ log^{[d]}(n): we use the advice for deciding L, together with log^{[d]}(n) bits of advice describing the number n, and O(1) bits of advice to specify the code for 𝒢^{d−1} and the procedure explained herein. Hence G_{s_{d−1}(n)} fools L′ to within error δ := (s_{d−1}(n))^{−1} ≤ (log^{[d]}(n))^{−1}, so overall we must have that 𝒢^d_n = 𝒢^{d−1}_n ∘ G_{s_{d−1}(n)} fools L to within error ϵ + δ ≤ ∑_{j≤d}(log^{[j]}(n))^{−1}.

It remains to bound the seed length and runtime of 𝒢^d_n. The seed length is O_{v−1}(log s_{d−1}(n)), and s_{d−1}(n) = O_{v−1}(log^{[d]}(n)), so overall the seed length is bounded by O_{v−1}(log^{[d+1]}(n)). On the other hand, the runtime is dominated by t_d(s_{d−1}(n)) = O_{d+v}(exp^{[d]}(s_{d−1}(n))) = O_{d+v}(exp^{[d]}(O_{v−1}(log^{[d]}(n)))) ≤ O_{d+v}(n).
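The recursive construction in the induction step can be sketched in code. The generators below are hypothetical placeholders (toy_prg just repeats its seed, so no pseudorandomness is claimed); what the sketch shows is the wiring 𝒢^d(z) = 𝒢^{d−1}(G(z)) and how each level of composition replaces a seed of length s by one of length about log s, driving the seed length down to roughly log^{[d+1]}(n).

```python
import math

def toy_prg(seed_bits, out_len):
    # Hypothetical stand-in for a PRG stretching a short seed to out_len bits.
    return (seed_bits * (out_len // max(len(seed_bits), 1) + 1))[:out_len]

def seed_len(d, n):
    """Seed length after d compositions: an iterated logarithm of n."""
    s = max(int(math.log2(n)), 1)
    for _ in range(d):
        s = max(int(math.log2(s)), 1)
    return s

def compose(d, z, n):
    """Evaluate the level-d generator on seed z, producing n output bits."""
    if d == 0:
        return toy_prg(z, n)
    s_prev = seed_len(d - 1, n)      # seed length of the level-(d-1) generator
    inner = toy_prg(z, s_prev)       # outer PRG stretches z to a level-(d-1) seed
    return compose(d - 1, inner, n)  # then run the level-(d-1) generator
```

As in the proof, the error and runtime of the composed generator are essentially the sums of those of the two constituents, while the seed length is the seed length of the outer, shortest-seed generator.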