How to Construct Random Strings
Abstract
We address the following fundamental question: is there an efficient deterministic algorithm that, given , outputs a string of length that has polynomial-time bounded Kolmogorov complexity or even ?
Under plausible complexity-theoretic assumptions, stating for example that there is an for which for appropriately chosen time-constructible , we show that the answer to this question is positive (answering a question of [27]), and that the Range Avoidance problem [18, 20, 27] is efficiently solvable for uniform sequences of circuits with close to minimal stretch (answering a question of [14]).
We obtain our results by giving efficient constructions of pseudo-random generators with almost optimal seed length against algorithms with small advice, under assumptions of the form mentioned above. We also apply our results to give the first complexity-theoretic evidence for explicit constructions of objects such as rigid matrices (in the sense of Valiant) and Ramsey graphs with near-optimal parameters.
Keywords and phrases:
Explicit Constructions, Kolmogorov Complexity, DerandomizationCopyright and License:
2012 ACM Subject Classification:
Theory of computation Complexity theory and logicAcknowledgements:
The authors would like to thank Hanlin Ren for many useful discussions.Editors:
Srikanth SrinivasanSeries and Publisher:
Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik
1 Introduction
1.1 Motivation
Constructing Random Strings.
Time-bounded Kolmogorov complexity [22] is a fundamental measure of complexity of a Boolean string. Given a string and a time bound , is the size of the smallest program such that outputs within steps, where is a universal Turing machine fixed in advance. Intuitively, for feasible time bounds , measures the inherent compressibility or “structure” in from the point of view of efficient algorithms, in the sense that any with low complexity has a short description from which it can be recovered efficiently. Thus a string with high complexity can be considered unstructured or random-like.
Given functions and a sequence of strings, where each is of length , let us call the sequence -hard if for all . We are interested in the complexity of producing a -hard sequence of strings, for polynomial time bounds and for as large as possible. Three particularly important settings we focus on here will be , , and . A simple argument shows that -hard strings cannot be produced in time by an algorithm which outputs when given in unary. Indeed, the existence of such an algorithm would imply that , as the universal machine can output in less than time when given in binary and a constant-size program for 111Here we are using the standard fact that any algorithm running in time can be simulated by the universal machine in time .. But is it possible for such a sequence to be produced by an algorithm running in time ?
Question 1: Let be any time-constructible function and let . Is there a deterministic algorithm , which given input , runs in time and outputs a -hard string?
Question 1 is a fundamental question about Kolmogorov complexity, for which we currently do not have clear positive or negative complexity-theoretic evidence. A positive answer to Question 1 would be very interesting, as it would demonstrate the power of having polynomially more time: a time algorithm would be capable of producing highly -incompressible strings, while a time algorithm can only produce strings that are highly -compressible. Indeed Hypothesis 1.19 in a recent work of [27] asks a very similar question222The difference is that instead of only asking about producing -hard strings for time-constructible functions , [27] asks for an algorithm that gets and as input in unary, and outputs a -hard string of length . to Question 1, and states that “The plausibility of Hypothesis 1.19 remains to be investigated”. Question 1 also has implications for the theory of meta-complexity, which has seen much recent work [2, 13, 24].
A natural approach to Question 1 is via pseudo-randomness. Under the assumption that requires non-deterministic circuits of size for some constant , it is known [15, 19] that for any constant there is a pseudo-random generator with seed length and error against co-non-deterministic circuits of size , computable in time . Consider , where is any constant. The property of being a -incompressible string, i.e., a length- string of -complexity , can be checked by a co-non-deterministic circuit of size at most , and moreover at least half of all strings of length satisfy the property. Hence close to half of the outputs of satisfy the property of -incompressibility, which means that the range of is a size set computable in time and containing at least one -incompressible string for large enough . However, it is unclear how to identify efficiently which of the strings in is incompressible - this seems to require calls to an oracle. We could try and combine the strings in together into a string that is -hard, for example by concatenating them. This approach has two drawbacks:
-
1.
Concatenating the strings yields a new string of length rather than length , since the set has size . Hence for any reasonable setting of , even , we do not get any interesting consequences for -hardness when we are interested in time bounds as a function of input length.
-
2.
Even if we don’t mind destroying the relation between the time bound and input length, concatenating strings which are maximally incompressible can, at best, achieve incompressibility . If we are in the more stringent compressibility settings or which occur more frequently in explicit construction applications, concatenating many strings appears to be useless.
Like most of the interesting problems in derandomization, answering Question 1 unconditionally would be difficult, as any -hard string for can be transformed easily into the truth table of a hard Boolean function in , i.e., a function with circuit complexity at least for some , and hence would imply . Instead, we would simply like complexity-theoretic evidence in favor of or against Question 1. Preferably, this evidence should avoid exotic assumptions, and involve standard beliefs about the difficulty of simulating one resource efficiently by another, e.g., the belief that time cannot be simulated by much smaller space or by much smaller non-uniformity.
Explicit Constructions and Range Avoidance.
Recently there has been a lot of interest [20, 18, 27, 12, 14, 11, 10, 7, 8, 5, 23, 21] in the Range Avoidance problem, where the input is a circuit from bits to bits, where , and the task is to find a non-output of . This task is efficiently doable with randomness - we can just output a random string of length , and this will be a non-output of with probability at least . The question is whether this can be done efficiently deterministically.
The problem was motivated in [20, 18] by its applications to explicit constructions of combinatorial objects. Let be a property, such as the property of a graph being Ramsey or of a matrix being rigid, that is satisfied by a random object with probability very close to . For many natural such , including the Ramsey and Rigidity properties, it can be shown that objects not satisfying can be efficiently recovered from a compressed representation. This recovery process can be modeled by a small Boolean circuit for which any non-output is the representation of an object satisfying . In other words, solving Range Avoidance on enables an efficient explicit construction for .
Given that explicit constructions for properties such as Ramsey and Rigidity have been long sought after, this motivates the study of the complexity of Range Avoidance, and in particular the search for evidence that Range Avoidance is feasible. Unfortunately it was shown in [14] that under plausible cryptographic and complexity-theoretic assumptions, Range Avoidance is intractable.
However the intractability of Range Avoidance doesn’t give evidence for the intractability of the explicit construction problems mentioned above - it could be that a weaker assumption than solving Range Avoidance is sufficient for efficient explicit constructions. Indeed such an assumption was identified in [14] - solving Range Avoidance for uniformly generated circuits. It turns out that all of the explicit construction problems studied in [20] can be captured by Range Avoidance on uniform circuits. This now raises the question of how tractable it is to solve Range Avoidance in this special case.
Question 2: Is there compelling complexity-theoretic evidence in favour of efficient explicit constructions for properties such as Ramsey and Rigidity, and more generally in favour of the tractability of Range Avoidance on uniform circuits?
Indeed the tractability of Range Avoidance on uniform circuits is raised explicitly in [14]. It is also stated there that “some of the authors are skeptical that this special case of Avoid is easy.”
The pseudorandomness approach to Question 1 can be applied to Question 2 as well. Essentially the same argument as we used there gives that, for properties such as Ramsey and Rigid, under standard derandomization assumptions, there is an efficient construction of a list of objects at least one of which satisfies the property. However, as we do not know polynomial-time algorithms for verifying the Ramsey or Rigidity properties, it is unclear how to pick out a single object satisfying the property from such a list. Indeed the pseudo-randomness approach applies to general Range Avoidance as well, but in that case we now have evidence that the problem is not tractable.
1.2 Our Results
We give positive answers to Questions 1 and 2, under believable complexity assumptions. Our hardness assumptions are of the following two forms, each involving a parameter :
Hypothesis (Hardness Assumptions for -Exponential Time Bounds).
-
1.
(Strong Form) There is an absolute constant so that the following holds. For any time constructible , with having at most -order exponential growth rate and , there is a function , computable in time, which cannot be computed in time with bits of advice by any -oracle machine for more than finitely many .
-
2.
(Weak Form) Let . There is some and a language computable in time which is not computable in time with an oracle and bits of advice even infinitely often.
We are using for the -fold iteration of the exponential function . Such assumptions, particularly of the second kind, are fairly standard in complexity theory. Indeed, when , the second assumption is a standard derandomization assumption, asserting that there is a language in which does not have oracle circuits of size . The main novelty in our case is that we use these assumptions for larger values , in particular for time bounds which have iterated-exponential growth rate. While we state the strong form assumption by quantifying over all time constructible , in fact we only require the assumption to hold for quite a limited class of bounds: the bounds needed are representable by constant length arithmetic expressions with , and ceiling/floor functions, while the bounds needed are simply of the form for constants .
The first assumption is analogous to assumptions made in the recent literature on hardness vs randomness [6], and generalizes the second assumption in a natural way. Both the first and second assumptions involve natural beliefs about the relationships between the fundamental resources of time, non-determinism and advice. Both assumptions formalize the intuition that time cannot be sped up by an arbitrary polynomial amount using non-determinism and advice. Indeed, note that there is no known way to speed up time even by a super-constant amount by using non-determinism and advice.
We believe that our assumptions hold relative to a random oracle, and here is an informal argument for the case of the second assumption. Given oracle , define the time bounded Turing machine to accept of length if the ’th string of length is in (where we consider the lexicographic order on strings). For a random oracle , the truth table of at length is Kolmogorov-random even conditioned on truth tables below length . And time non-deterministic machines can only access strings of at length or below, so it should not be possible to compute the first bits of at length from this information together with arbitrary bits of advice.
Our approach to using these assumptions in service of constructing random strings is via pseudo-randomness, and is a variation on the strategy we critiqued a few paragraphs ago: construct a pseudorandom generator, enumerate its range, and concatenate the constituent strings. The problems with this approach discussed in the last section boiled down to one issue: for a standard complexity-theoretic PRG fooling polynomial size nondeterministic circuits, we can at best achieve a seed length of , and hence will end up needing to concatenate a very long list of strings. To remedy the situation, it would suffice to use pseudorandom generators with much smaller seed length of or even . While it is known that seed length is information-theoretically optimal for fooling nonuniform circuits of size , we aim to exploit the uniformity of the adversary we are fooling to get seed length as small as possible; for uniform adversaries the lower bound of no longer applies. This yields a result which is interesting in its own right: efficient constructions of small pseudo-random sets for uniform (or even slightly non-uniform) algorithms, under assumptions of the form discussed in the previous paragraph.
Theorem 1 (Informal, see Theorems 16 and 18).
Under the Hypotheses above with , there is a pseudorandom generator with seed length which fools ; if we assume the strong form the generator runs in polynomial time, and if we assume the weak form it runs in quasipolynomial time.
More generally, under the above hardness assumption with large parameters (i.e. for higher-exponential time bounds), we obtain pseudorandom generators fooling with seed length (for infinitely many ). If we assume the strong form hypothesis our generators will run in polynomial time, and if we assume the weak form they will run in iterated quasipolynomial time. When , our construction will require in addition that the input length is of the form for some , or more generally that .
Some terms in the above require clarification. We are using to denote the -fold iteration of the logarithmic function. Our notion of “iterated quasipolynomial time” will be defined in the preliminaries - it describes a family of runtime bounds which grows faster than polynomial and quasipolynomial, but is much closer to polynomial than to exponential. Finally, the notation denotes the time bounded Kolmogorov complexity of the number , when written as a string in standard binary notation.
Our PRG construction can actually handle advice (nonuniformity) up to when the seed length is , which we observe is optimal via a standard argument (Lemma 22). It also holds for fooling larger time bounds for (at the cost of the time complexity of the generator being low as a function of rather than of ), and for other oracles in place of the oracle (provided we use in place of in the hardness assumption).
We then turn our attention briefly to PRGs fooling uniform algorithms without oracles: what is the minimal seed length such that we can fool by a PRG with running time ? For this problem we show in Theorem 20 that the machinery developed above is unnecessary: if the PRG is capable of simulating the distinguishers it is trying to fool, there is a straighforward method which can reduce seed length to arbitrarily small seed length. Hence under the standard assumptions that achieve logarithmic seed length for (namely that requires exponential circuit size) we can obtain an arbitrarily strong reduction in seed length for . Like our -oracle result, this result applies to distinguishers with mild nonuniformity, and the achievable seed length scales with the uniformity in a way that is provably optimal according to Lemma 22.
Random Strings and Explicit Constructions.
We next use our short-seed pseudorandom generators for (Theorem 1) to give (conditional) constructions of -random strings. Our first construction of random strings is the following, which achieves incompressibility :
Theorem 2 (Informal, see Theorems 23 and 24).
Under our main hardness assumption with , for any there is an algorithm such that outputs an -bit string satisfying for all ; the algorithm will run in polynomial time if we use the strong form of the assumption, and quasipolynomial time if we use the weak form.
More generally under our main hardness assumption for larger values of , we obtain constructions of strings with . Using the strong form of the assumption the construction runs in polynomial time, and under the weak form it runs in iterated quasipolynomial time. When we require the integer to be of the form for some integer , or more generally to satisfy .
If we are satisfied with our construction algorithms succeeding on some unknown infinite set of input lengths, we are able to bootstrap the above constructions so that the loss in -complexity becomes additive rather than multiplicative:
Theorem 3 (Informal, see Theorem 25).
Assume the first version of our main hardness assumption for all . Then for any there is a polynomial time algorithm such that outputs an -bit string satisfying for infinitely many .
In Section 4.2 we then use the previous results to give conditional polynomial time explicit constructions of various important combinatorial objects shown reducible to Range Avoidance in [20]. We will defer most of the specifics to Section 4.2, but highlight the following two results:
Theorem (See Theorems 28 and 29).
-
1.
Under the strong form of our main hardness assumption with , there is a time algorithm which produces a matrix which is Valiant-Rigid whenever is a power of 2.
-
2.
Assuming that there is a language in that is not computable in (even infinitely often)333This is a particular case of the weak form hardness assumption., there is a language in that requires Boolean circuits of size for all .
Crucially, our results give the first standard-form hardness assumptions under which the above explicit construction problems (and others in Section 4.2) have an efficient algorithm, and they do so via a universal approach: we have a single object, -random strings, whose efficient explicit construction follows from plausible hardness assumptions, and which automatically satisfies various other pseudorandom properties, e.g. rigidity as a matrix or maximal hardness as a Boolean function.
Further Applications.
In Sections 4.3 and 4.4 we discuss two further applications of our random string constructions. The first shows a “hardness condensation” phenomenon for the main hardness assumptions we use in this paper (with the lower bound relativized to a oracle), whereby we may amplify the degree of nonuniformity in the lower bounds from to . The second addresses the question of the relative difficulty of uniform range avoidance and general (nonuniform) range avoidance posed in [27, 14], where we observe using our main results and some previous hardness results ([14, 1]) that under plausible hardness assumptions, nonuniform range avoidance is significantly harder than the uniform variant.
Barriers to Improving Our Main Results.
In the final section we give two sets of results indicating barriers to improving our main PRGs and random string constructions. Our first result casts our iterated-logarithmic seedlength PRG fooling (Theorem 1) as a reconstructive multi-source extractor, similar to Trevisan’s analysis [29] of the classical hardness randomness connection in [26, 15] in terms of reconstructive single-source extractors. We show that any such extractor must have iterated-logarithmic seed length, and thus that any improvement to the seed length of our PRG in Theorem 1 must use substantially different techniques.
Second, we give an oracle separation showing that it is not possible via black-box arguments to obtain our main results from more standard assumptions like . In [20] it is shown that explicit construction of random strings and hard truth tables are equivalent with respect to oracle reductions. If such an equivalence held without oracles it would supersede our main results here: we could start with an assumption that requires -size circuits, and then use the reduction to produce from the hard truth table a string with high time bounded Kolmogorov complexity. We show that no such reduction exists which is relativizing. This justifies in some sense the need for more high-end hardness assumptions to achieve our main results.
1.3 Overview of Main Construction
We describe at a high level our construction of pseudorandom generator with small seed length secure against any computable in . Using the classical hardness-randomness connection, under the assumption that requires exponential size -oracle circuits we can obtain a generator with seed length . The key observation is that this first attempt at a generator significantly overshoots our primary goal in one respect: the generator obtained from the generic hardness-randomness connection would in fact be secure against machines which have access to bits of advice. On the other hand we only want a generator which fools uniform algorithms.
To use this to our advantage, we consider a recursive argument. Let , and consider as our new input length. Define the language consisting of the seeds of such that . We know that by the security of our first generator , and therefore if we could find a second generator which fools the language , we would find that the composition fools the original language . If we would have made significant progress. For such a generator to fool , we would need it to fool -oracle algorithms with running time ; however the input length of is , so as a function of the new input length we need to fool algorithms running in exponential time with an oracle. We may proportionally increase the allowable runtime of to exponential as well; so overall we have scaled the time complexities of our generator/distinguisher by an exponential. However, crucially, the advice complexity of deciding remains : this is where we use crucially the uniformity of our distinguisher. If required bits of advice compute, then the best advice upper bound we could place on is which is trivial.
Continuing in this way, we are able to iteratively reduce the seed length of our original generator by a logarithmic composition, for an arbitrary constant number of phases. The cost is that we need to obtain, in each step, psuedorandom generators with logarithmic seed length that fool algorithms running in time with an oracle, and which are computable in a comparable amount of time (without an oracle). To achieve such generators we rely on the standard toolkit of hardness-randomness transformations [26, 15, 30]. In this way, we can argue the security of our construction based on the kind of hardness assumptions described above.
There are a few additional intricacies that we are skimming over here, which contribute to the two caveats in our main theorem statements: the weaker runtime bounds when using the second version of our hardness assumptions, and the requirement that the number is of the form for some when using the first version of our hardness assumptions. The first of these issues roughly stems from the fact that, when trying to apply our hardness assumption to get an seed length generator on input length (which we will then compose with the outer generator to reduce the total seed length by another logarithm), we will actually need to apply our hardness assumption on some other input length ; this will mean that we can at best hope for our generator to run in time , which will be superpolynomial when . This issue can be avoided by considering the more refined first version of our hardness hypothesis.
For the second issue, it turns out that we can only guarantee the security of our generator on inputs lengths such that the number itself has time-bounded Kolmogorov complexity . For this changes nothing, since every number has via its binary representation, so we achieve a generator with seed length which is valid on all input lengths, but for large values of our construction is only valid for infinitely many (in particular it will work for all of the form for some ). The reason for this issue is as follows: in the above exposition, we considered the language of seeds mapping to strings in the base language , and argued it is computable in time exponential in its input length. When we continue this argument for more steps and obtain languages , we would like to argue at each step that they are uniformly computable with a small amount of advice (comparable to the input length they are defined on). However, the definition of these languages depends on the original input length that we started on, and since is not injective for any , we cannot determine solely from the smaller input lengths, but will need to somehow supply the number as advice. If we assume the number is sufficiently compressible, we are able to skirt this issue and complete the argument.
1.4 Related Work
We have already mentioned connections of our work to meta-complexity and Range Avoidance. In terms of derandomizing uniform algorithms, there has been a lot of work on uniform hardness-randomness tradeoffs, beginning with [16], but much of that work is not relevant to our setting as the emphasis is on uniform assumptions for derandomization, while we are interested instead in decreasing the seed length, and are not concerned with uniformity of the assumption.
However the recent work of [6] on super-fast derandomization is indeed very relevant from a technical point of view. They give a (conditional) derandomization of into . The classical approach to derandomizing would be to model it as a circuit of size with input length (the input represents the randomness) and use a PRG fooling this class. To fool this class of circuits requires a PRG with seed length and running time , and if we could therefore not hope to achieve derandomization in time no matter how fast the PRGs runtime is. A key observation in [6] is that that this approach to derandomizing actually overshoots whats required: if we model the machine more precisely as a machine on input length with nonuniformity , then we only need a PRG which fools ; for this task, it is possible (at least information-theoretically) to achieve a seed length of or more generally rather than , which means that the enumeration of seeds will only cost us time.
They then proceed to construct a PRG for with seed length and computable in time . In their case a great deal of work must be dedicated to optimizing the runtime of the PRG which has no relevant parallel in our work; however the question of reducing the seed length down to the true level of uniformity, and the assumptions under which they are able to achieve it, have similarities to ours. In particular, they require assumptions of the form where for a large constant .
The innovation in our work is the use of recursion to achieve dramatically smaller seed length against inefficient uniform adversaries using a small amount of non-uniformity, which enables us to address Questions 1 and 2.
2 Preliminaries
We start with some notation for the classes of growth rates appearing in this work:
Definition 4 (Growth Rates).
Both are base 2. For any function , we use to denote the -ary composition of with itself, with being the identity . We say that a function has elementary growth rate, or that it is “elementary,” if for some absolute constant . We say that a funciton is time-constructible if there is a Turing machine such that prints in binary, and has running time on all inputs.
For each and we define the function . Using this function we define a hierarchy of notations more relaxed than the standard as follows; for , we say if for some and sufficiently large , and if for some and sufficiently large . , are defined analogously.
In this notation we have , , . This hierarchy of growth rates enumerates a class of strongly subexponential growth rates having magnitude much closer to than to : indeed for each fixed , is closed under composition and grows slower than for any fixed . Similarly, , , , and for each fixed the class has a growth rate much closer to than to , and in particular far exceeds the growth rate of for any fixed .
Definition 5 (Languages and Complexity Classes).
For functions and language we define the complexity class consisting of those languages decidable in time with an oracle for the language and using bits of advice on inputs of length . When we omit this argument.
For a language we use to denote the restriction of to , and to denote the set of languages so that is computed by Boolean circuits of size for all .
We next define our notation for time-bounded Kolmogorov complexity:
Definition 6 (Kolmogorov Complexity).
We fix an efficient universal oracle turing machine once and for all. For an oracle language , time bound , we use to denote the length of the shortest program so that halts with output in at most time steps. If or is the empty string we omit them from the notation as in , respectively.
For a number , we use to denote where is the canonical binary representation of .
Finally we introduce relevant notation for pseudorandom generators:
Definition 7 (Pseudorandom Generators).
For a family of “distinguishers” and , we say that is a pseudorandom generator (PRG) secure against with error if, for all , we have
We say that is a hitting set generator (HSG) if it satisfies the weaker condition
Let , . If is a complexity class (set of languages) and is an ensemble of generators, we say that is a PRG (resp. HSG) secure against if, for all , there is so that for all , is a PRG (resp. HSG) secure against with error .
2.1 Range Avoidance and Construction of Random Strings
We formalize a general notion of explicit construction problems as follows:
Definition 8 (Explicit Construction Problems).
An explicit construction problem is defined by a language , such that for all . The computational task is: given as input, output a string .
If is an explicit construction problems, we say that a function is a “list solution” to if for all . The “list size” is defined as .
As discussed in the introduction, an important class of explicit construction problems are those reducible to the problem range avoidance:
Definition 9 (Range Avoidance [18, 20, 27]).
Range avoidance, or “Avoid,” is the following search problem: given a Boolean circuit with , output a string . We say that this Avoid instance has “stretch” .
We say that an explicit construction problem reduces to Avoid in polynomial time with stretch function , if there is a polynomial time algorithm which, given , outputs an Avoid instance so that whenever is a solution for , we have .
The key connection between random strings and explicit construction problems reducible to Range Avoidance is the following:
Observation 10 ([27]).
Say that for every , there is a polynomial time algorithm which, for every (resp. infinitely many ) outputs a string with . Then every explicit construction problem reducible to Avoid with stretch function is solvable in polynomial time .
Proof.
The definition of the reduction implies that there is a polynomial time algorithm , so that whenever we have . If is such that the algorithm constructing runs in time , then we observe that every string in the range of satisfies .
Recall that our approach to constructing random strings will consist of two steps. First, we will try only to construct a short list of strings, one of which is guaranteed to have complexity : we then concatenate the list to obtain a single string whose complexity degrades by a factor proportional to the length of the list. The second step (concatenation) is justified by the following standard claim:
Observation 11.
Let and let be their concatenation. Then for any time bound and we have .
For the first step (obtaining a short list of candidate solutions), we rely on the following observation:
Observation 12.
Let be a hitting set generator secure against with error . Then there exists a seed so that ; in other words the range of is a list solution for the set of strings with .
Proof.
At least half of bit strings have . On the other hand is computable in time with an oracle.
Hence, we have reduced the problem of constructing random strings to constructing hitting generators against with short seed length. The next section is dedicated to the construction of such a generator under plausible hardness assumptions. We will in fact produce a pseudorandom generator (despite only needing a hitting set generator).
3 PRGs for Uniform Classes with Near-Optimal Seed Length
The main goal in this section is to produce a pseudorandom generator computable in a time, or more generally time for some fixed constant , which is secure against uniform algorithms and has seed length significantly smaller than . We will phrase all of our assumptions/generator construction with respect to an arbitrary oracle in place of for the sake of generality.
We will consider here two qualitatively distinct classes of hardness assumption used to instantiate our generators, each parameterized by an oracle (typically we set ) and a constant :
Hypothesis 1 (Strong Assumption for , abbreviated ).
There is so that for all time constructible and all time constructible , the following holds: there is a sequence of strings so that the map is computable uniformly in time, but no machine running in time with an oracle and bits of advice can compute for more than finitely many . In the case , we only require the assumption to hold in the case .
We refer to as the “length bound” in the above.
Hypothesis 2 (Weak Assumption for parameters , , abbreviated ).
Let and . There is some constant and a language computable in which is not computable in on more than finitely many input lengths.
Note that translates to the assumption that requires circuit complexity with oracles, which is the standard regime in which polynomial time generators fooling with logarithmic seedlength can be constructed by known methods.
3.1 Fooling with Seed Length
The first step in our construction is to use classical hardness-randomness constructions to give pseudorandom generators which fool with seed length and runtime in the case is very large, under the appropriate hardness assumptions. Depending on which assumption we use we get a generator with different parameters:
Lemma 13.
Assume . Then for every time constructible there is a PRG computable uniformly in time which fools .
Lemma 14.
Assume , , . Then for any there is a PRG computable uniformly in time which fools and has seed length .
We will prove the first case (Lemma 13) here, and reserve Lemma 14 for the Appendix as it’s proof is similar. We require the following standard tool from complexity-theoretic pseudorandomness:
Theorem 15 (Black-Box Hardness-Randomness Connection [15]).
For each exists a constant and a uniform algorithm with the following behavior. On input and given oracle access to a function with , computes a function in time for such that for any with
there exists a circuit with oracle gates computing whose total size is at most .
Proof of Lemma 13.
We will invoke on some yet to be determined time bound and length bound , with respect to oracle and constant ; let be the implied constant guaranteed by the hypothesis (which does not depend on the choice of , ).
Next, invoke Theorem 15 with parameter , and let be the guaranteed constant from this Theorem. So there is a function which, given oracle access to a function with , runs in time with oracle calls to , and such that whenever distinguishes from uniform, there is a -oracle circuit of size computing .
Now we set and let , which are both time constructible, and use these in our specific invocation of as hinted previously. We then obtain an algorithm which, in time , computes a string , which cannot be computed by by any machine running in time with bits of advice. We then claim that, setting gives the required generator. By assumption it is computable in time . On the other hand if it were distinguished in time with bits of advice, we’d obtain a circuit computing every bit of of size with oracle gates that can be evaluated in time , hence overall we would be able to compute in time with bits of advice, provided . Recalling from our application Theorem 15 that , we have that and we are done.
3.2 Recursive Generator Construction
We give here our main construction of a generator fooling languages decidable in polynomial time with an oracle under a natural computational hardness assumption, specifically the assumptions , . Our construction will have seed iterated-logarithm type seed length length for infinitely many ; in particular it will work for all such that the number has time-bounded Kolmogorov complexity comparable to the seed length. Afterwords we show that a similar construction gives a generator fooling the same class with a slower runtime under the weak form hardness assumptions; the proof of this construction will be relegated to the appendix.
Theorem 16.
Let be any oracle, , and assume Hypothesis . Let be time constructible. There exists a generator computable uniformly in time, so that the following hold:
-
1.
For all ,
-
2.
For all , we have
for all but finitely many .
Proof.
Using our hardness assumption in combination with Lemma 13, for each time constructible bound bounded above by there is a constant and a pseudorandom generator computable in which fools with error . Using we may define a sequence of constants and functions as follows:
-
1.
, .
-
2.
Set , ,
Observe that .
Then for each , we have the generator defined for every input, which we abbreviate as . We use this family of generators to construct the generators promised in the theorem statement. In particular, we define for every the generator
We prove its security by induction, using the following stronger induction hypothesis for each :
-
1.
is computable in time with the number and additional bits as advice
-
2.
fools any , with error for all sufficiently large satisfying
For the base case, we know that which, by assumption, fools with error . Recall that , hence is computable in time (with no advice). Now assume the induction hypothesis up to . Let be given, sufficiently large, so that and and . Define , . By induction, is computable in time given the number as advice, and additional advice bits; hence is computable in : if we are given the number as advice, then we may determine using the number as advice which costs , after which we can compute the generator , and the language using additional bits of advice (together with some bits to specify the algorithm described in the current sentence). Under the assumption we may produce the number from an additional bits of advice, for a total advice cost , and the time of the additional operations (beyond the original computation of ) is bounded by .
Now, using the second part of our induction hypothesis, we determine that fools , in particular we have:
On the other hand, using the security of the generator on input length (which is sufficiently large), we have that fools , in particular:
Recalling that and combining the previous two inequalities using the triangle inequality we have:
which establishes the second condition in our inductive hypothesis. For the first, note that to compute , we need only recover the number from its shortest description in time , compute which takes time by induction, and finally compute which takes time ; overall this is bounded by .
At this point the theorem is proven; it remains to verify a few bounds:
-
1.
so that the error bound is as stated in the theorem.
-
2.
-
3.
The first and second hold trivially. For the last, we prove it by induction; more specifically we will show by induction that . In the base case, . We then have that
| (1) | |||
| (2) | |||
| (3) |
where the last step uses the induction hypothesis.
We highlight an important special case of the above:
Theorem 17.
Assume that for some and every time constructible , , there is a function computable uniformly in time which is not computable in even infinitely often. Then for every , there is a pseudorandom generator family with seed length which fools for all sufficiently large .
Proof.
We will set , , in Theorem 16. Observe that for all , we have since we may encode the number in binary.
If we rely instead on the weak form hypotheses we get:
Theorem 18.
Assume Hypothesis with , fixed constants, and a fixed constant. Then there is a pseudorandom generator with seed length and runtime that fools with error whenever .
As in the case of Lemma 14, we relegate the proof to the appendix since it uses essentially the same ideas as that of Theorem 16 and is in fact a bit simpler. As before we highlight an important special case, obtained immediately from Theorem 18 by setting , , :
Theorem 19.
Assume that there is some and a language which is not computable in on more than finitely many input lengths. Then for any there is a pseudorandom generator fooling with seedlength and quasipolynomial runtime.
3.3 Fooling Uniform Deterministic Time
For our application to construction of random strings, we will use the generator in the previous section with oracle setting . It is nonetheless natural to consider the implications of our generator in the case ; in this case we obtain a generator with seed length fooling under plausible hardness assumptions. However, we demonstrate here that in the regime of a deterministic distinguisher, once seed length is achieved for , we can reduce the seed length all the way down to an arbitrarily small super-constant value; more generally, we can fool with essentially optimal seed length for any time constructible . The discrepancy between the simplicity of the result here and the work required in the previous section is due to the key distinction that in this regime, the pseudorandom generator has enough resources to simulate the distinguishers it is trying to fool.
Theorem 20.
Let be time constructible. Assuming requires -size Boolean circuits for sufficiently large , there is a polynomial time computable generator which fools with error .
Proof.
Using the hardness assumption we obtain (uniformly in ) a pseudorandom generator with seed length for some constant . Setting and enumerating the outputs of our PRG, we obtain a list of bit strings which is a pseudorandom generator for . Let , let enumerate machines. Define the matrix with ; we may compute in time.
We now deterministically construct a set of size so that for every we have
If we can accomplish this, we output the condensed PRG whose range consists of the strings and has seed length and are finished. The algorithm to find the set is given in the next lemma.
Lemma 21.
There is a polynomial time algorithm which, given with , outputs a set of rows so that , and for every column , we have
Proof.
Using known results on the so-called “set balancing problem” [25], for any we may efficiently compute , so that
for every . In particular, provided , i.e. , we have that . Initializing , we may apply the above to obtain , , and for every
Hence
For a suitable choice of we will have for an arbitrarily small constant , and , hence
3.4 Optimality of the Seed Length
We include a simple (and basically standard) argument which indicates that the seed lengths obtained in the previous are essentially optimal with respect to the amount of advice they can handle:
Lemma 22.
Let be time-constructible, and an arbitrary family of generators which fools almost everywhere with error . Then .
In particular, if fools then for some inverse-total computable .
Proof.
Let , and consider the family of languages with depending only on the first bits of , and . For each , we have that , hence there exists a language so that for all sufficiently large, we have . On the other hand for every we have , hence fails to fool the language for all sufficiently large. Every language is decidable in deterministic time with bits of advice (we only need that , and hence are time constructible) so we get a contradiction if or .
In the case of deterministic time (Section 3.3) this implies that the stated construction is optimal for advice complexity . For our main construction fooling , we have no indication that to fool uniform languages (no advice) requires iterated-logarithmic seed length. However Theorem 16 is able to fool languages with advice complexity with seed length , so in this sense we are able to achieve the maximum advice complexity for the given seed length . It is consistent with our current knowledge that fooling completely uniform algorithms, even with an oracle, is achievable with arbitrarily slow-growing time constructible seed lengths, as we can achieve in the case of deterministic adversaries (under standard hardness assumptions).
4 Construction of Random Strings and Applications
In this section we discuss the applications of our main result to explicit constructions of random strings and circuit lower bounds.
4.1 Random String Construction Via PRG Concatenation
We start by applying our PRG constructions to give polynomial time explicit constructions of highly random strings. The first is a direct combination of our main PRG constructions with Observations 11 and 12 from the preliminaries. We start with the situation in which we use the strong form of our hardness assumptions, and hence Theorem 16 as our PRG:
Theorem 23.
Let , , and assume Hypothesis . Then for any constant , there is a polynomial time algorithm so that, for all with , is an -bit string with .
In the special case , , under a hardness assumption for singly exponential time bounds and oracles we obtain an algorithm which produces, for every , a string of length with for all .
More generally, a hardness assumption for exponential time bounds and oracles yields an efficient algorithm for computing strings with -oracle time-bounded Kolmogorov complexity, provided there is a time oracle algorithm which can test the complexity of an -bit string.
Proof.
This is a rather direct corollary of Theorem 16. Let be given with , and let be the seed length bound of the generator from Theorem 16 using the base time bound . Let be the largest integer such that ; clearly , hence we may apply Theorem 16 to compute (uniformly) a pseudorandom generator which fools on input length . By Observation 12 we conclude that some string in the range of has , hence the concatenation of all strings in the range of (padded at the end with s to bring the length up to ) will have complexity at least (recall Observation 11).
The second part of the theorem is the special parameter setting given by Theorem 17, and for the last sentence in the statement, we use the fact that Theorem 17 works for any oracle .
If we instead rely on the weaker form hardness assumptions and the associated PRG from Theorem 18 we obtain via the exact same argument:
Theorem 24.
Assume Hypothesis with , . Then for every , there is a time algorithm such that for all sufficiently large , is a string satisfying .
In particular, assuming that there is a language in which cannot be decided in infinitely often, there is a quasipolynomial time algorithm which outputs strings with .
More generally, a hardness assumption for exponential time bounds and oracles yields an efficient algorithm for computing strings with -oracle time-bounded Kolmogorov complexity, provided there is a time oracle algorithm which can test the complexity of an -bit string.
We now move on to a second class of constructions which achieves a far superior degree of randomness, but the cost is that our construction only works for infinitely many , and unlike the previous we have no control over which specific values of our construction succeeds on.
Theorem 25.
Assume Hypothesis . Then for any constant , there is a polynomial time algorithm so that, for infinitely many , is an -bit string with .
Proof.
Applying the assumption with Theorem 16 and the argument in the previous proof, we have an algorithm which prints, for all sufficiently large in , a list of strings, one of which must have . We now define a second algorithm , which, given , computes the largest value with , and prints the element of the list (if is larger than the length of the list, it does something arbitrary). For every , there is some which prints a string of length with .
As before, a modified variant of this construction can be done in the case of the alternative weak form hardness assumptions, where the cost of using the weaker kind of assumption is that the run time of the construction will degrade correspondingly; we leave the details to the interested reader as we will not use this variant in what follows.
4.2 Explicit Constructions Under Plausible Hardness Assumptions
By our main result, under reasonable hardness assumptions we are able to obtain explicit constructions of strings with time-bounded Kolmogorov complexity , where grows like an arbitrarily slow iterated logarithm function. We show here how to use this construction to build various important pseudorandom objects under our main hardness assumptions. We start with a list of fundamental explicit construction problems, focusing on those studied originally in [20], in addition to so-called “incompressible functions” considered in [3]. For background on substantial literature dedicated to these problems see [20, 3],
Definition 26 (List of Explicit Construction Problems).
-
-Rigidity: A matrix such that whenever , has at most nonzero entries, has rank at most
-
-Hard Boolean Functions: A Boolean function not computed by any Boolean circuit of size
-
-Incompressible Boolean Functions: A Boolean function which cannot be expressed in the form where is a circuit of size and is an arbitrary Boolean function
-
--Complexity: A communication matrix such that cannot be solved by space- communication protocols
-
-Bit-Probe Complexity: A data structure problem which requires space or time in the bit-probe model.
-
Ramsey: A graph with no clique or independent set of size
We then have:
Lemma 27.
Say that a Range Avoid instance has “stretch ” if it is of the form . The following uniform reductions exist from the problems in Definition 26 to Range Avoidance:
-
1.
-Rigidity reduces uniformly to Avoid with stretch
-
2.
-Hard Boolean Function reduces uniformly to Avoid with stretch
-
3.
-Incompressible Boolean Functions reduces uniformly to Avoid with stretch
-
4.
--Complexity reduces uniformly to Avoid with stretch
-
5.
-Bit-Probe Complexity reduces uniformly to Avoid with stretch
-
6.
Ramsey reduces uniformly to Avoid with stretch
Proof.
All of these proofs occur in [20], with the exception of Incompressible Functions. For this, if for a circuit of size and , we may represent the circuit (gate by gate) using bits via a standard encoding, and the function via its -bit truth table. From these it is clear how we may reproduce the function efficiently.
Combining this with our first generator construction from the previous section and Observation 10, we have:
Theorem 28.
Assume Hypothesis . Then for all with , we have the following:
-
1.
Polynomial time construction of matrices in which are
-rigid. In particular, for any such matrices achieve Valiant rigidity. -
2.
Boolean functions in with circuit complexity at least
-
3.
Boolean functions in which are incompressible
-
4.
Polynomial time construction of communication matrices which require space (for any setting of )
-
5.
Polynomial time construction of data structure problems which require space or time in the bit probe model (for any setting of )
In the important case , each of the constructions works for all and requires the hardness assumption only for singly-exponential time bounds.
We arrive at similar conclusions if we use the weak form hypothesis in combination with Theorem 18 instead, at the cost of a slow-down in our construction algorithms. We highlight below the results obtained by using the special case of Theorem 18 given in Theorem 19:
Theorem 29.
Assume that there is some and a language which is not computable in on more than finitely many input lengths (this is Hypothesis ). Then we have the following for all :
-
1.
Quasipolynomial time construction of matrices in which are -rigid.
-
2.
Boolean functions in with circuit complexity at least
-
3.
Boolean functions in which are incompressible
-
4.
Quasipolynomial time construction of communication matrices which require space (for any setting of )
-
5.
Quasipolynomial time construction of data structure problems which require space or time in the bit probe model (for any setting of )
The above results cover the vast majority of explicit construction problems considered originally in [20]. A notable exception is the construction of near-optimal Ramsey graphs; here the stretch given by Lemma 27 is to small to apply Theorem 23, and we must appeal instead to Theorem 25. Using this approach, we will be able to construct near-optimal Ramsey graphs infinitely often, provided we modify appropriately our encoding of graphs by strings to be well-defined on every input length:
Definition 30 (Ramsey Graphs of Every Length).
We associate every every string to a graph as follows. Let and set to be the greatest integer so that . Set , truncate to the first bits and interpret it as a graph on vertices canonically. Under this encoding, we say that encodes a Ramsey graph if contains no clique or independent set of size .
We then have:
Lemma 31.
Assume Hypothesis . Then there is a polynomial time algorithm which, for infinitely many , outputs a string so that is Ramsey.
Proof.
We claim that we may uniformly construct a Range Avoidance instance so that for all , contains every such that fails to be Ramsey. In particular, given we may compute efficiently the parameters , being the number of vertices and edges for the graphs ; we then apply Lemma 27 to obtain (uniformly) a Range Avoidance instance with stretch covering all strings which fail to encode Ramsey graphs. If fails to be Ramsey, then for some which fails to be Ramsey, hence we may uniformly construct an Avoid instance covering every non-Ramsey string with stretch , which is . We thus conclude that whenever fails to be Ramsey. Applying Theorem 25 and our hardness assumption yields the lemma.
4.3 Hardness Condensation
We also apply Theorem 23 to derive a new hardness condensation result. Hardness condensation is a phenomenon where a mild hardness assumption can be transformed into a much stronger hardness assumption of the same type. It was first studied in [4], who showed hardness condensation results for complexity classes with advice. It is more interesting to get hardness condensation for uniform classes, and indeed [20] achieves this for - he shows that requires exponential-size circuits iff it requires almost maximum-size circuits.
We obtain a new hardness condensation result for deterministic time without an -oracle. Here the condensation is with respect to non-uniform complexity - the stronger hardness assumption can handle almost a maximum non-trivial amount of non-uniformity, while the weaker hardness assumption only involves an exponential amount of non-uniformity.
Theorem 32.
The following ar equivalent:
-
1.
For every there exists so that, for all time constructible , we have is not contained even infinitely often in
-
2.
For every there exists so that, for all time constructible , we have is not contained even infinitely often in
Proof.
Clearly the second item implies the first. We show that the first item implies the second. Assuming (1), we have that every every there exists so that holds. Applying Theorem 24, for every there is so that for any time constructible , there is an algorithm running in time which, given , outputs a string of length which cannot be produced by any algorithm running in space with bits of advice 444We are using here the fact that can be simulated in . Interpreting this string as a function , we obtain a language in which is not contained even infinitely often in . Reparameterizing the time bounds yields (2).
4.4 Range Avoidance vs Uniform Range Avoidance
We use our results together with previous work to argue that the Range Avoidance problem becomes much more tractable for uniformly computable maps. While there is compelling complexity-theoretic evidence in various settings that Range Avoidance is intractable in general, our results show that Uniform Range Avoidance is tractable under plausible complexity-theoretic assumptions.
The first setting we consider is where the Range Avoidance instance is an arbitrary polynomial-size circuit , and we wish to solve Range Avoidance in polynomial time.
Ilango, Li and Williams [14] showed that in this setting, Range Avoidance is infeasible, under the standard assumption that and the plausible assumption that subexponentially-secure indistinguishability obfuscation exists [17].
Theorem 33 ([14]).
If and subexponentially-secure indistinguishability obfuscation exists, then for any , there is no polynomial-time algorithm which solves Range Avoidance for polynomial-size circuits mapping to bits.
Corollary 34.
Assume that , subexponentially-secure indistinguishability obfuscation, and Hypothesis . There is a constant such that for all , there is a polynomial-time algorithm which solves Range Avoidance on uniform circuits of size mapping bits to bits but no polynomial-time algorithm which solves Range Avoidance on all circuits of size mapping bits to bits.
Proof.
We show that we can take to be , for arbitrarily small . Indeed, for such and , the intractability of general Range Avoidance follows from Theorem 33, under the given assumptions. We show that uniform Range Avoidance can be solved by applying Theorem 23 and using the third assumption. Indeed, by this assumption, for every constant , there exists a polynomial-time algorithm which for all , on input outputs a length- string of complexity at least .
Let be a sequence of uniform circuits mapping bits to , where is any constant greater than . Here our notion of uniformity is standard -uniformity, but the argument can be adapted to work for more relaxed notions of uniformity. Note that any output has complexity at most for any , as we can first generate given using the uniformity condition and then generate from and by simulating . By running on input , we obtain a string of length and complexity at least , which is therefore a non-output of when is large enough.
We also consider the setting where the Range Avoidance instance is an arbitrary polynomial-size -oracle circuit , and we wish to solve Range Avoidance in polynomial time. The negative evidence for this is even stronger - the task is intractable under the standard assumption . Somewhat surprisingly, we show that uniform Range Avoidance is tractable under plausible assumptions even though the uniform Range Avoidance instance is allowed to use a -oracle.
Theorem 35 ([1]).
Suppose (resp. ). There is a constant such that for all , there is no polynomial-time (resp. quasipolynomial time) algorithm which solves Range Avoidance on all -oracle circuits of size .
In fact, [1] showed that computing an -incompressible string of length conditional on a given string is hard for polynomial time if , where is Kolmogorov space-bounded complexity. This is easily seen to imply the result above by considering the Range Avoidance instance which takes a -oracle program as input and evaluates it. The extension to quasipolynomial time bounds is not stated in [1] but follows immediately from the proof.
Corollary 36.
Assume , and that . Then there is a constant such that for all , there is a polynomial-time algorithm which solves Range Avoidance on uniform -oracle circuits of size but no polynomial-time algorithm which solves Range Avoidance on all -oracle circuits of size .
Proof.
5 Barriers to Improvements
In this section we present two barriers to improving our main results. First we show that the seed length achieved in our PRG from Section 3 is in some sense optimal for hardness/randomness approaches which use the same overall structure as our argument: specifically, we show that our argument in fact yields a construction of a special kind of seeded extractor with seed length , and prove unconditionally that this is the minimal achievable seed length in any such construction. Second we show that there is no relativizing argument that directly reduces the construction of -random strings to the construction of hard truth tables.
5.1 Multi-Reconstructive Extractors for CG Sources
The classical approach to turning hardness into randomness [26, 15] was famously shown by Trevisan [29] to be essentially equivalent to the task of constructing an explicit seeded extractor. Roughly speaking, Trevisan showed that any method for producing a pseudorandom generator using a hard boolean function whose correctness is proven in a sufficiently black box fashion must actually produce a seeded extractor, which treats the hard function as a high min-entropy source and the seed of the PRG as the seed in the seeded extractor.
In Section 3 a PRG was constructed against uniform algorithms (or more generally, algorithms with extremely small advice), whose seed length was much smaller than , using a different kind of hardness assumption applied across many different input lengths. In the section we cast such a hardness randomness construction in terms of certain seeded multi-source extractors, where each source corresponds to a hardness assumption used at a different input length. Through this connection we are able to show: (1) our construction immediately yields a seemingly new (and very simple) multi-source seeded extractor with interesting parameters, and (2) the iterated logarithmic (, ) seed length in our PRGs from Section 3 is necessary for any hardness/randomness tradeoff that uses the same general framework as ours, and in particular which uses “only many” hardness assumptions.
We start by recalling the proof of correctness of the PRG from Section 3, first in the case where seed length is achieved using a hardness assumption at two different input lengths. We first used a hard function whose truth table had size in order to get a PRG with seed length ; call this . We then bounded the run time of this PRG by some , and in the next step required a hard function of truth table size , which was hard for algorithms with running time ; in particular needed to be hard even in the regime where the computation of is easy. For this reason, we may roughly interpret this as saying, is a function (of much smaller input length) which is hard conditioned on . When we iterate the argument to reduce the seed length to , we require a sequence of functions , of smaller and small input lengths, with being hard for algorithms with enough running time to compute all of , i.e. which is hard conditioned on . This is naturally analogous to the following well-studied information-theoretic notion of a sequence of random variables, each with high min-entropy conditioned on the previous:
Definition 37 (CG Sources [9]).
A random variable is said to be a CG source with blocks and entropy sequence , if for every and every ,
In the case , this means .
A CG extractor is a function which can produce nearly uniform randomness from a CG source and an independent uniform seed:
Definition 38 (Seeded CG Extractors).
We say that a function is a CG extractor for entropy sequence and error if, for every CG source supported on with entropy sequence , we have that is -close to in TV distance, where is the uniform distribution on (independent of the CG source) and is uniform on . The parameter is called the seed length of the extractor.
We will now show that our PRG construction can be viewed as giving a specific kind of explicit reconstructive extractor for CG sources (with a seed), in the same sense that the standard hardness randomness constructions can be viewed as reconstructive single source seeded extractors via Trevisan’s connection. We define such reconstructive seeded CG extractors as follows:
Definition 39 (Multi-Reconstructive Extractors).
Let . We say that is a multi-reconstructive extractor for entropy sequence and error if there are -time algorithms , each taking oracle access to some , with the following behavior
-
1.
Each takes strings and an advice string and outputs a string in (in the case the only input to is ).
-
2.
For every , if
then there exists and some such that
Note that in the case , we obtain the standard definition of reconstructive single source extractors. As shown in the single-source case by Trevisan [29], we may observe that any multi-reconstructive extractor is automatically a CG extractor with similar parameters:
Lemma 40.
If is a muti-reconstructive extractor for entropy sequence and error , then it is a CG extractor for entropy sequence and error .
Proof.
Let be a CG source with entropy sequence . The deviation of from uniform is bounded by , where is the maximum over all distinguishers of the probability that any of the reconstruction procedures associated with succeed on a random sample from the source. We bound the probability for each reconstruction procedure separately and take a union bound over . For a fixed and any fixing of , there are at most values in that can output as we range over ; if then the probability that lies in this set is at most . We can now recast the central step in our main construction in Theorem 16 in the language of reconstructive extractors:
Theorem 41.
Let be single source extractors , , so that is a reconstructive single-source extractor for min entropy and error and is computable in polynomial time. Say that for each , we have . Define by induction on , with , and . Then is a multi-reconstructive extractor for entropy sequence and error .
Proof.
We prove by induction on that is a multi-reconstructive extractor with error . By definition it is true for . Now, say that is a reconstructive extractor for entropy sequence and error . So there are reconstruction procedures taking oracle access to a function so that, given any distinguishing with error , there exists so that produces for some . We then consider ; our reconstruction procedures for will be as they were for , and for we use the construction procedure for the single source extractor . Let be given. For any such that the first reconstruction procedures all fail to reconstruct a symbol of from its prefix using , we know that fools with error . Now define the test (recall ), with . Then, for a uniformly random with have that with probability in the range . Observe that is efficiently computable with oracle access to and the values , since each of the extractors are efficiently computable. Thus, using the reconstruction procedure for , either succeeds in reconstructing using oracle access to , the previous inputs , and bits of advice, or else we must have that fools with error , and hence the overall construction fools with error , completing the proof.
Using the above in combination with explicit families of single-source reconstructive extractors (e.g. Trevisan’s extractor) we get:
Corollary 42.
For any and fixed constants , , there is an explicit multi-reconstructive extractor for entropy sequence and error , where:
Moreover, for each , and hence by Lemma 40 this construction is also a CG extractor with the same parameters stated above.
We now prove that the seed length achieved above is optimal as a function of the output length , regardless of how we choose the source lengths :
Theorem 43.
Say is a CG extractor for min entropy sequence , , and error . Then .
To prove this we rely on the following lemma:
Lemma 44.
Let with . Then there is a CG source with entropy sequence supported on , with .
Proof of Lemma 44.
We prove by induction on ; the case is trivial. Let be given; for each define , . So Define the set by
Let ; if then we must have
| (4) | |||
| (5) | |||
| (6) |
and we get a contradiction. Hence, we know that . For every , we also know that has size at least , so by induction there is a CG source with min entropies , supported on . We then construct our overall distribution as follows: sample uniformly, then sample and output . We know that the first coordinate of this distribution has min entropy , and that the remaining entries are a CG source with min entropy sequence conditioned on any fixing of the first coordinate; so overall is a CG source with entropy sequence .
Proof of Theorem 43.
When , is of the form , so that is close to uniform on ; in this case we clearly require . For , we will show how to choose some and construct from a second extractor using only sources, so that is a CG extractor for entropy sequence , the same error (), and at most exponentially larger seed length . This will yield the theorem by induction.
We will set to be any index such that ; if we can find such an index, we construct by simply moving input of into the seed, i.e.
Clearly for any choice of , remains a CG extractor for the contracted entropy sequence and the same error as . To argue that for some we rely on Lemma 44 above. Let consist of all functions which depend only on the first bits. So . For every , there is some such that
hence there exists a fixed , so that for the set , we have . We then apply Lemma 44 with to conclude that there is a CG source supported on with min entropy sequence , . If for all this will contradict the correctness of the extractor , since will be distinguished with error by the test . Hence there exists such that , i.e. . Recalling that , we get which concludes the proof.
Note that the proof yields more than just a lower bound on :
Observation 45 (Follows from Proof of Theorem 43).
In any construction achieving the optimal seed length , there must be some smallest source length with , then a second smallest source length of , and so on with the longest source having length on the order
Hence our construction uses essentially the only possible sequence of source lengths which can be made to achieve a an optimal seed length of .
5.2 No Relativizing Reduction from Randomness to Hard Truth Tables
In [20] it is shown that if we are permitted the use of an oracle in our explicit construction algorithms we have the following appealing equivalence:
Theorem 46.
The following are equivalent:
-
1.
There is a polynomial time -oracle algorithm constructing strings with
-
2.
There is a polynomial time -oracle algorithm constructing truth tables with hardness (i.e. ).
If this kind of equivalence could be shown without the aid of an oracle, it would supersede all of the main results in this paper. The proof of Theorem 46 is relativizing, and more specifically it gives a black-box reduction from the problem of constructing high Kolmogorov-complexity strings to constructing hard truth tables (the nontrivial direction). We give some indication here that an oracle is necessary for such an argument to work.
Observe that if has circuit complexity , then for and interpreting as an -bit string, we have ; this is because evaluating a circuit on an input takes at most time, and hence reconstructing from a circuit computing it takes at most quadratic time. Hence if there were an analogue to Theorem 46 in the polynomial time regime, it would give, in particular, a reduction from constructing strings of large to strings of large complexity.
We show here that there is no such reduction which is “black box” in the same sense as Theorem 46. We first need to define the suitable notion of a black box reduction:
Definition 47.
Let be fixed constants and large. A black box reduction from -construction to -construction is an algorithm, given access to some oracle , which has the following behavior:
-
1.
There is a procedure which, given strings with , makes additional queries to and outputs a string
-
2.
For any oracle , if are strings such that for each , outputs a string with .
Theorem 48.
For fixed constants, , there is no black box reduction from -Avoid to -Avoid.
Proof.
Consider the Range Avoidance instance , defined by where is some standard pairing function. Clearly for any oracle and any such that we have . Hence if we can find an oracle and supply strings so that for each , but outputs a string in then we are done. Initially we fix the value of the oracle to be zero on all strings of length at most . We then conclude that for any , we may determine a string so that is already forced by the current information about the oracle; this holds since a machine running in time cannot access the any unfixed part of the oracle. We will fix these strings for the remainder of the proof.
Now, consider the algorithm with the first arguments fixed, as a function of the remaining arguments . By the correctness of , for any extension of the partially defined oracle and any valid solutions for these remaining arguments to , will find a solution to the Range Avoidance instance . On the other hand, observe that if are chosen uniformly at random, then for any oracle the probability that any fails to have is bounded by since for each such we have . Hence we obtain a randomized query procedure, making queries to the oracle , which outputs a solution to Avoid on with failure probability . Since we have not fixed the behavior of above input length , can take on any value , and is thus an arbitrary oracle-presented Range Avoidance instance. It then remains only to show that a randomized query algorithm making queries to an Avoid instance with stretch cannot succeed with probability as high as .
To obtain this randomized query lower bound, we appeal to Yao’s lemma, and consider the best success probability of a deterministic -query algorithm on a uniformly random instance . For each possible sequence of queries we argue the probability the supplied answer is incorrect conditioned on this query sequence ocurring is is at least . This holds trivially, since , and hence a random extending a fixed sequence of mappings has probability at least of hitting any fixed string in .
References
- [1] Scott Aaronson, Harry Buhrman, and William Kretschmer. A qubit, a coin, and an advice string walk into a relational problem. In Venkatesan Guruswami, editor, 15th Innovations in Theoretical Computer Science Conference, ITCS 2024, January 30 to February 2, 2024, Berkeley, CA, USA, volume 287 of LIPIcs, pages 1:1–1:24. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. doi:10.4230/LIPICS.ITCS.2024.1.
- [2] Eric Allender. The complexity of complexity. In Adam R. Day, Michael R. Fellows, Noam Greenberg, Bakhadyr Khoussainov, Alexander G. Melnikov, and Frances A. Rosamond, editors, Computability and Complexity – Essays Dedicated to Rodney G. Downey on the Occasion of His 60th Birthday, volume 10010 of Lecture Notes in Computer Science, pages 79–94. Springer, 2017. doi:10.1007/978-3-319-50062-1_6.
- [3] Benny Applebaum, Sergei Artemenko, Ronen Shaltiel, and Guang Yang. Incompressible functions, relative-error extractors, and the power of nondeterministic reductions (extended abstract). In Proceedings of the 30th Conference on Computational Complexity, CCC ’15, pages 582–600, Dagstuhl, DEU, 2015. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. doi:10.4230/LIPICS.CCC.2015.582.
- [4] Joshua Buresh-Oppenheim and Rahul Santhanam. Making hard problems harder. In 21st Annual IEEE Conference on Computational Complexity (CCC 2006), 16-20 July 2006, Prague, Czech Republic, pages 73–87. IEEE Computer Society, 2006. doi:10.1109/CCC.2006.26.
- [5] Lijie Chen, Shuichi Hirahara, and Hanlin Ren. Symmetric exponential time requires near-maximum circuit size. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 1990–1999. ACM, 2024. doi:10.1145/3618260.3649624.
- [6] Lijie Chen and Roei Tell. Simple and fast derandomization from very hard functions: eliminating randomness at almost no cost. In STOC, pages 283–291. ACM, 2021. doi:10.1145/3406325.3451059.
- [7] Yeyuan Chen, Yizhi Huang, Jiatu Li, and Hanlin Ren. Range avoidance, remote point, and hard partial truth table via satisfying-pairs algorithms. In Barna Saha and Rocco A. Servedio, editors, Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023, pages 1058–1066. ACM, 2023. doi:10.1145/3564246.3585147.
- [8] Yilei Chen and Jiatu Li. Hardness of range avoidance and remote point for restricted circuits via cryptography. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 620–629. ACM, 2024. doi:10.1145/3618260.3649602.
- [9] Benny Chor and Oded Goldreich. Unbiased bits from sources of weak randomness and probabilistic communication complexity. In 26th Annual Symposium on Foundations of Computer Science (sfcs 1985), pages 429–442, 1985. doi:10.1109/SFCS.1985.62.
- [10] Eldon Chung, Alexander Golovnev, Zeyong Li, Maciej Obremski, Sidhant Saraogi, and Noah Stephens-Davidowitz. On the randomized complexity of range avoidance, with applications to cryptography and metacomplexity. Electron. Colloquium Comput. Complex., TR23-193, 2023. URL: https://eccc.weizmann.ac.il/report/2023/193.
- [11] Karthik Gajulapalli, Alexander Golovnev, Satyajeet Nagargoje, and Sidhant Saraogi. Range avoidance for constant depth circuits: Hardness and algorithms. In Nicole Megow and Adam D. Smith, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2023, September 11-13, 2023, Atlanta, Georgia, USA, volume 275 of LIPIcs, pages 65:1–65:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023. doi:10.4230/LIPICS.APPROX/RANDOM.2023.65.
- [12] Venkatesan Guruswami, Xin Lyu, and Xiuhan Wang. Range avoidance for low-depth circuits and connections to pseudorandomness. In Amit Chakrabarti and Chaitanya Swamy, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2022, September 19-21, 2022, University of Illinois, Urbana-Champaign, USA (Virtual Conference), volume 245 of LIPIcs, pages 20:1–20:21. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2022. doi:10.4230/LIPICS.APPROX/RANDOM.2022.20.
- [13] Shuichi Hirahara. Meta-computational average-case complexity: A new paradigm toward excluding heuristica. Bull. EATCS, 136, 2022. URL: http://bulletin.eatcs.org/index.php/beatcs/article/view/688.
- [14] Rahul Ilango, Jiatu Li, and R. Ryan Williams. Indistinguishability obfuscation, range avoidance, and bounded arithmetic. In Barna Saha and Rocco A. Servedio, editors, Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023, pages 1076–1089. ACM, 2023. doi:10.1145/3564246.3585187.
- [15] Russell Impagliazzo and Avi Wigderson. P = bpp if e requires exponential circuits: derandomizing the xor lemma. In Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, STOC ’97, pages 220–229, New York, NY, USA, 1997. Association for Computing Machinery. doi:10.1145/258533.258590.
- [16] Russell Impagliazzo and Avi Wigderson. Randomness vs. time: De-randomization under a uniform assumption. In 39th Annual Symposium on Foundations of Computer Science, FOCS ’98, November 8-11, 1998, Palo Alto, California, USA, pages 734–743. IEEE Computer Society, 1998. doi:10.1109/SFCS.1998.743524.
- [17] Aayush Jain, Huijia Lin, and Amit Sahai. Indistinguishability obfuscation from well-founded assumptions. In Samir Khuller and Virginia Vassilevska Williams, editors, STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 60–73. ACM, 2021. doi:10.1145/3406325.3451093.
- [18] Robert Kleinberg, Oliver Korten, Daniel Mitropolsky, and Christos H. Papadimitriou. Total functions in the polynomial hierarchy. In James R. Lee, editor, 12th Innovations in Theoretical Computer Science Conference, ITCS 2021, January 6-8, 2021, Virtual Conference, volume 185 of LIPIcs, pages 44:1–44:18. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021. doi:10.4230/LIPICS.ITCS.2021.44.
- [19] Adam R. Klivans and Dieter van Melkebeek. Graph nonisomorphism has subexponential size proofs unless the polynomial-time hierarchy collapses. In Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, STOC ’99, pages 659–667, New York, NY, USA, 1999. Association for Computing Machinery. doi:10.1145/301250.301428.
- [20] Oliver Korten. The hardest explicit construction. In 62nd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2021, Denver, CO, USA, February 7-10, 2022, pages 433–444. IEEE, 2021. doi:10.1109/FOCS52979.2021.00051.
- [21] Oliver Korten and Toniann Pitassi. Strong vs. weak range avoidance and the linear ordering principle. In 65th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2024, Chicago, IL, USA, October 27-30, 2024, pages 1388–1407. IEEE, 2024. doi:10.1109/FOCS61266.2024.00089.
- [22] Ming Li and Paul M. B. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications, 4th Edition. Texts in Computer Science. Springer, 2019. doi:10.1007/978-3-030-11298-1.
- [23] Zeyong Li. Symmetric exponential time requires near-maximum circuit size: Simplified, truly uniform. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 2000–2007. ACM, 2024. doi:10.1145/3618260.3649615.
- [24] Zhenjian Lu and Igor C. Oliveira. Theory and applications of probabilistic kolmogorov complexity. Bull. EATCS, 137, 2022. URL: http://bulletin.eatcs.org/index.php/beatcs/article/view/700.
- [25] Rajeev Motwani, Joseph (Seffi) Naor, and Moni Naor. The probabilistic method yields deterministic parallel algorithms. Journal of Computer and System Sciences, 49(3):478–516, 1994. 30th IEEE Conference on Foundations of Computer Science. doi:10.1016/S0022-0000(05)80069-8.
- [26] Noam Nisan and Avi Wigderson. Hardness vs randomness. Journal of Computer and System Sciences, 49(2):149–167, 1994. doi:10.1016/S0022-0000(05)80043-1.
- [27] Hanlin Ren, Rahul Santhanam, and Zhikun Wang. On the range avoidance problem for circuits. In 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022, Denver, CO, USA, October 31 – November 3, 2022, pages 640–650. IEEE, 2022. doi:10.1109/FOCS54457.2022.00067.
- [28] Ronen Shaltiel and Christopher Umans. Simple extractors for all min-entropies and a new pseudorandom generator. J. ACM, 52(2):172–216, March 2005. doi:10.1145/1059513.1059516.
- [29] Luca Trevisan. Extractors and pseudorandom generators. J. ACM, 48(4):860–879, July 2001. doi:10.1145/502090.502099.
- [30] Christopher Umans. Pseudo-random generators for all hardnesses. In Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing, STOC ’02, pages 627–634, New York, NY, USA, 2002. Association for Computing Machinery. doi:10.1145/509907.509997.
Appendix A Proof of Lemma 14
We prove here Lemma 14, which we restate below for convenience:
Lemma (Lemma 14, Restated).
Assume Hypothesis , , . Then for every time constructible there is a PRG computable uniformly in time which fools and has seed length .
For this we require a strengthening of Theorem 15 due to Umans [30] (following a similar result of Shaltiel-Umans [28]) given as follows:
Theorem 49 ([30]).
There is a fixed universal constant so that the following holds. Let be any time-constructible “hardness” parameter, . On input and given oracle access to a function , computes a function in time for some time constructible , , such that for any with
there exists a circuit with oracle gates computing whose total size is at most .
Proof of Lemma 14.
Assuming Hypothesis , we conclude that for some , there is a language in which is not in even infinitely often. Let , and define by where are the time constructible bounds guaranteed in Theorem 49. By Theorem 49, if is distinguished by some then there is an oracle circuit of total size computing ; hence if a distinguisher for exists which is computable in time with an oracle and bits of advice for infinitely many , then is decidable with an oracle in time with bits of advice for infinitely many .
Let for some constant . We then define our final generator in terms of by a simple reparameterization of input lengths; for , we determine the least such that and and set (truncating the output length of as necessary). We want to show that fools time algorithms using an oracle and bits of advice, and runs in time with seed length . For the first point, observe that for any running in time with bits of advice, we obtain a computation of running in time time with bits of advice, which contradicts our hardness assumption provided and which both hold by construction.
On the other hand the runtime and seedlength of are bounded by and respectively, so it remains to bound the growth rate of . Recall that we chose to be the least integer such that and , where . For the first point, we have hence we can bound by the least integer satisfying , i.e. , so it suffices here to take . On the other hand, to have it suffices to have , i.e. , so setting suffices. Hence in the end we obtain a runtime of and seed length .
We will use a slight generalization of Lemma 14 that follows directly by padding (the difference compared to the previous lemma is merely that we allow a slightly more general upper bound on ):
Corollary 50.
Assume Hypothesis , , and let be a fixed constant. Then for every time constructible there is a PRG computable uniformly in time which fools and has seed length .
Proof.
Using Lemma 14, from our hardness assumption we get a generator fooling with runtime and seed length . On input length , we choose some such that and apply our generator on input length ; clearly we may use this generator for input length as well, by considering a language as depending on only the first bits. It suffices to take , in which case the seed length as a function of is and the runtime is .
We are now ready to prove Theorem 18:
Theorem (Theorem 18, Restated).
Assume Hypothesis with , , fixed constants, and a fixed constant. Then there is a pseudorandom generator with seed length and runtime that fools with error whenever .
Proof.
Let , , be fixed constants and let . We prove by induction on that there is a pseudorandom generator which runs in time and fools with error and has seed length , provided the input length satisfies .
In the case we may apply Lemma 14 directly. Now, say that the generator is given, computable in time with and fooling with error . Let be a time constructible function such that ; we may set . Now, let be the generator guaranteed by Corollary 50 for time bound ; so fools , runs in time , and has error and seed length . We then set to be the generator with seed length given by . The run time of is bounded by some constant times the sum of the run times of the two constituent generators, which overall is bounded by . Let be decided by a machine; recall that for some . Define given by ; since fools with error , we have that . On the other hand is decidable in time with bits of advice provided is of the form : we use the advice for deciding , together with bits of advice describing the number , and advice to specify the code for and the procedure explained herein. Hence fools to within error , so overall we must have that fools to within error . It remains to bound the seed length and runtime of . The seed length is , and , so overall the seed length is bounded by . On the other hand the runtime is dominated by .
